IT Officer Notes Ebook
File System:
Stores permanent records in various files.
Needs application programs to access and manipulate the data.
Drawbacks of the file system approach:
Data Redundancy
Data Inconsistency
Difficulty in accessing data
Data Integrity problems
Low Security
Data Isolation: Transactions executing concurrently must not interfere with one another; each transaction should behave as if it were the only one running, and the database must remain in a consistent state after any transaction completes. As an example, if two people are updating the same catalog item, it is not acceptable for one person's changes to be "clobbered" when the second person saves a different set of changes. Both users should be able to work in isolation, each working as though he or she were the only user, and each set of changes must be isolated from those of the other user.
Data Integrity is the assurance that information is unchanged from its source and has not been accidentally (e.g. through programming errors) or maliciously (e.g. through breaches or hacks) modified, altered or destroyed. In other words, it concerns the completeness, soundness, and wholeness of the data and its compliance with the intention of the data creators. It is a logical property of the database, independent of the actual data.
Data Consistency refers to the usability of the data and is mostly discussed in a single-site environment. Even in a single-site environment, consistency problems may arise during recovery activities, when original data is replaced by backup copies. You have to make sure that your data remains usable while it is being backed up.
Data Abstraction: To simplify the interaction between users and the database, the DBMS hides information that is not of interest to users; this is called Data Abstraction. The developer hides complexity from users and presents an abstract view of the data.
1)External/View Level: It is the user's view of the database. This level describes the part of the database that is relevant to each user.
2)Conceptual/Logical Level:
Describes what data is stored in the database and the relationship among the data.
Represent all entities, their attributes and their relationship
Constraints on the data
Security and Integrity information
3)Physical/Internal Level: Describes how the data is actually stored in the database (file structures, indexes and other storage details).
Schemas: The overall logical design (structure) of the database, which changes infrequently.
Instances: The collection of information stored in the database at a particular moment.
Sub-schema: It is a subset of a schema and inherits the same properties that the schema has. It is an application programmer's or user's view of the data item types and record types which he or she uses.
DBMS Components:
1)Hardware
2)Data
3)Software
4)Users
5)Procedures(Set of rules for database management)
Types of Users:
a)Naive Users:
End Users of the database who work through menu driven application programs, where
the type and range of response is always indicated to the users.
b)Online Users:
Those users who may communicate with database directly through an online terminal.
c)Application Programmer:
Those users who are responsible for developing the application program.
d)DBA(Database Administrator)
DBA(Database Administrator):
DBA directs or performs all activities related to maintaining a successful database
environment.
Functions of DBA:
Defining and modifying the database schema
Defining storage structures and access methods
Granting authorization for data access
Routine maintenance such as backups, performance monitoring and tuning
Database Languages:
1)DDL(Data Definition Language):
Deals with database schemas and descriptions of how the data should reside in the database.
Used to create or alter/modify a database or table structure and schema. Commands used in DDL:
Create
Alter
Drop
Rename
Truncate
Comment
2)DML(Data Manipulation Language):
Deals with manipulation of the data present in the database, e.g. retrieving, inserting, updating and deleting data. Commands used in DML:
Update
Select
Insert
Delete
Merge
Call
Lock Table
3)DCL(Data Control Language):
Deals with rights, permissions and other access controls of the database system. Commands used in DCL:
Grant
Revoke
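A minimal sketch of the DCL commands, assuming a hypothetical employee table and a user account named clerk already exist:
GRANT SELECT, UPDATE ON employee TO clerk;
REVOKE UPDATE ON employee FROM clerk;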
4)Transaction Language:
Controls and manages transactions to maintain the integrity of data within SQL statements.
Commands used in Transaction Language:
Set Transaction
Commit
Savepoint
Rollback
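A minimal sketch of these commands working together, assuming a hypothetical orders table with three columns:
SET TRANSACTION READ WRITE;
INSERT INTO orders VALUES (101, 'Pen', 5);
SAVEPOINT after_first_insert;
INSERT INTO orders VALUES (102, 'Pencil', 10);
ROLLBACK TO SAVEPOINT after_first_insert;  -- undoes only the second insert
COMMIT;                                    -- makes the first insert permanent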
Database Model:
The logical structure of a database; it fundamentally determines the manner in which data can be stored, organized and manipulated.
1)Hierarchical Model:
Data is organized in a tree-like structure, implying a single parent for each record.
Allows one-to-many relationships.
2)Network Model:
Allows many-to-many relationships in a graph-like structure that allows multiple parents.
Organizes data using two fundamental concepts called records and sets.
3)Relational Data Model:
Collection of tables to represent data and the relationships among those data. E.g.: Oracle, Sybase.
The Hierarchical, Network and Relational data models are types of Record-Based Models.
1)Entity: A "thing" or "object" in the real world that is distinguishable from all other objects. An entity has a set of properties, and the values of some of those properties may uniquely identify it.
2)Entity Set:
Collection of entities all having same properties or attributes.
3)Attributes:
Each entity is described by set of attributes/properties. Attributes are descriptive properties
possessed by each member of an entity set.
For each attribute, there is a set of permitted values called the domain or value set of the attribute.
Types of attributes:
1)Simple Attribute: Not divided into subparts, e.g. any unique number like 1234.
2)Composite Attribute: Divided into subparts, e.g. Name is divided into first name, middle name and last name.
3)Single-Valued Attribute: Has a single value for a particular entity, e.g. order_id.
4)Multivalued Attribute: Has more than one value for a particular entity, e.g. Phone No.
5)Derived Attribute: The attribute value is derived from some other attribute, e.g. Age can be derived from Date of Birth.
1)Primary key:
A primary key is a column or set of columns in a table that uniquely identifies the tuples (rows) in that table.
A relation may contain many candidate keys. When the designer selects one of them to identify a tuple in the relation, it becomes the primary key. If there is only one candidate key, it is automatically selected as the primary key.
2)Composite key
A key that consists of two or more attributes that uniquely identify an entity occurrence is called a composite key. Any single attribute that makes up the composite key is not a simple key in its own right.
3)Super Key
A super key is the most general type of key. A super key is a set of one or more columns (attributes) that uniquely identifies rows in a table. A super key is a superset of a candidate key.
4)Candidate key
A candidate key is a minimal super key: no attribute can be removed from it without losing the uniqueness property. Candidate keys are columns (or combinations of columns) in a table that qualify for uniqueness of each row/tuple. Every table must have at least one candidate key but may have several.
5)Secondary key
Out of all candidate keys, only one gets selected as the primary key; the remaining keys are known as alternate or secondary keys.
6)Foreign key
A FOREIGN KEY in one table points to a PRIMARY KEY in another table. Foreign keys act as a cross-reference between tables.
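A minimal sketch of how a primary key and a foreign key are declared, using hypothetical department and employee tables:
CREATE TABLE department (
dept_id INT PRIMARY KEY,
dept_name VARCHAR(50)
);
CREATE TABLE employee (
emp_id INT PRIMARY KEY,
emp_name VARCHAR(50),
dept_id INT,
FOREIGN KEY (dept_id) REFERENCES department(dept_id)
);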
Relationship Type: A relationship type defines a set of associations among entities of the
different entity types.
Constraints on a relationship type:
a)Cardinality Ratio: Specifies the number of relationship instances that an entity can participate in. The possible cardinality ratios are 1:1, 1:N, N:1 and M:N.
b)Participation Constraint: Specifies whether the existence of an entity depends on its being related to another entity via the relationship type. There are two types of participation constraints: total and partial.
1)Specialization:
A top-down design process in which sub-groupings within an entity set are identified. Consider an entity set person, with attributes name, street, and city. A person may be further classified as one of the following:
customer
employee
2)Generalization:
A bottom-up design process in which multiple entity sets that share common features are combined into a higher-level entity set (e.g., customer and employee generalize to person).
3)Aggregation:
An abstraction in which a relationship set, together with its associated entity sets, is treated as a higher-level entity set so that it can participate in other relationships.
Normalization: It is the process of removing redundant data from your tables in order to
improve storage efficiency, data integrity and scalability. This improvement is balanced
against an increase in complexity and potential performance losses from the joining of the
normalized tables at query-time.There are two goals of the normalization process:
eliminating redundant data (for example, storing the same data in more than one table)
and ensuring data dependencies make sense (only storing related data in a table). Both of
these are worthy goals as they reduce the amount of space a database consumes and
ensure that data is logically stored. Normalization is also called a bottom-up approach because this technique requires full knowledge of every participating attribute and its dependencies on the key attributes. If you try to add new attributes after normalization is done, it may change the normal form of the database design.
Consider a relation R(A, B, C, D) with the set of functional dependencies F = {A → B, D → C}.
From the set of functional dependencies F, we can derive the primary key. For R, the key can be (A, D), a composite primary key. That means AD → BC, i.e. AD can uniquely identify B and C. But in this case both A and D together are not required to identify B or C uniquely: to identify B, attribute A is enough; likewise, to identify C, attribute D is enough. The functional dependencies AD → B and AD → C are called partial functional dependencies.
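A minimal sketch of removing these partial dependencies by decomposition, using hypothetical table names R1, R2 and R3 and arbitrary column types:
-- R(A, B, C, D) with A -> B and D -> C is split so that every
-- non-key attribute depends on the whole key of its table.
CREATE TABLE R1 (A INT PRIMARY KEY, B VARCHAR(50));
CREATE TABLE R2 (D INT PRIMARY KEY, C VARCHAR(50));
CREATE TABLE R3 (
A INT,
D INT,
PRIMARY KEY (A, D),
FOREIGN KEY (A) REFERENCES R1(A),
FOREIGN KEY (D) REFERENCES R2(D)
);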
a)1NF
b)2NF
c)3NF
d)BCNF
e)(4NF)
f)5NF
a)1NF: A relation is considered to be in first normal form if all of its attributes have domains that are indivisible or atomic.
A table is in 1NF if and only if it satisfies the following five conditions:
There is no top-to-bottom ordering to the rows.
There is no left-to-right ordering to the columns.
There are no duplicate rows.
Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).
All columns are regular, i.e. rows have no hidden components such as row IDs or object IDs.
b)2NF:
A table is in 2NF if it is in 1NF and no non-prime attribute is partially dependent on any candidate key, i.e. every non-prime attribute is fully functionally dependent on every candidate key. An attribute that is not part of any candidate key is known as a non-prime attribute.
c)3NF:
A table is in 3NF if it is in 2NF and no non-prime attribute is transitively dependent on a candidate key, i.e. there are no attributes X, Y, Z such that
X -> Y
Y does not -> X
Y -> Z
where Z is a non-prime attribute.
In other words, 3NF can be explained like this: a table is in 3NF if it is in 2NF and for each functional dependency X -> Y at least one of the following conditions holds:
X is a super key of the table, or
Y is a prime attribute (each element of Y is part of some candidate key).
An attribute that is a part of one of the candidate keys is known as a prime attribute.
d)BCNF:
A table is in BCNF if for every non-trivial functional dependency X -> Y, X is a super key of the table.
BCNF is more restrictive than 3NF. While decomposing relations to make them BCNF we may lose some dependencies, i.e. BCNF does not guarantee the dependency-preservation property.
e)4NF
A table is in 4NF if it is in BCNF and has no non-trivial multi-valued dependencies, i.e. for every non-trivial multi-valued dependency X ->> Y, X is a super key.
f)5NF
Fifth normal form (5NF), also known as project-join normal form (PJ/NF) is a level
of database normalization designed to reduce redundancy in relational databases
recording multi-valued facts by isolating semantically related multiple relationships.
A table is said to be in the 5NF if and only if every non-trivial join dependency in it is
implied by the candidate keys.
A join dependency *{A, B, ..., Z} on R is implied by the candidate key(s) of R if and only if each of A, B, ..., Z is a superkey for R.
Normalization vs De-Normalization:
Normalization removes data redundancy, i.e. it eliminates any duplicate data from the same table and puts it into a separate new table. De-normalization creates data redundancy, i.e. duplicate data may be found in the same table.
Normalization maintains data integrity, i.e. any addition or deletion of data from a table will not create any mismatch in the relationships between the tables. De-normalization may not retain data integrity.
Normalization increases the number of tables in the database and hence the joins needed to get a result. De-normalization reduces the number of tables and hence the number of joins, so query performance is faster compared to normalized tables.
Even though normalization creates multiple tables, inserts, updates and deletes are more efficient: if we have to insert/update/delete any data, we perform the transaction in that particular table only, so there is no fear of data loss or loss of data integrity. With de-normalization, all the duplicate data is in a single table and care must be taken to insert/delete/update all the related data in that table; failing to do so creates data integrity issues.
Use normalized tables where a large number of insert/update/delete operations are performed and the joins of those tables are not expensive. Use de-normalization where joins are expensive and queries are run frequently on the tables.
Relational Algebra
Below are fundamental operations that are "complete". That is, this set of operations alone
can define any retrieval.
Select
Project
Rename
Union
Set Difference
Cartesian Product
Selection (σ)
Selection is used to select the required tuples of a relation.
For a relation R with an attribute c,
σ(c>3)(R)
will select the tuples of R in which c is greater than 3.
Note: the selection operator only selects the required tuples but does not display them; for displaying data, the projection operator is used.
Projection (π)
Projection is used to project the required column data from a relation.
Union (∪)
The union operation in relational algebra is the same as the union operation in set theory; the only constraint is that for a union of two relations, both relations must have the same set of attributes.
Rename (ρ)
Rename is a unary operation used for renaming attributes of a relation.
ρ(a/b)(R) will rename the attribute b of relation R to a.
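As a rough SQL analogue of these operations (not part of relational algebra itself), assume hypothetical tables R(a, b, c) and S(a, b, c):
-- Selection: σ(c>3)(R)
SELECT * FROM R WHERE c > 3;
-- Projection: π(a, b)(R); DISTINCT mirrors the set semantics
SELECT DISTINCT a, b FROM R;
-- Rename: ρ(x/b)(R), renaming attribute b to x
SELECT a, b AS x, c FROM R;
-- Union: R ∪ S (both relations have the same attributes)
SELECT a, b, c FROM R UNION SELECT a, b, c FROM S;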
Relational Calculus
1. Relational algebra operations manipulate relations and provide expressions in the form of queries, whereas relational calculus queries are formed on the basis of pairs of expressions.
2. RA has operators like join, union, intersection, division, difference, projection, selection, etc., whereas RC has tuple- and domain-oriented expressions.
3. RA is a procedural language, whereas RC is a non-procedural query system.
4. The expressive power of RA and RC is equivalent. This means any query that can be expressed in RA can be expressed by a formula in RC.
5. Any RC formula can be translated into an algebraic query.
6. Modifying queries is easier in RA than in RC.
7. RA has a mathematical form but no specific query language; RC also has a mathematical form and has a query language, QUEL.
8. Relational algebra is easier to manipulate and understand than RC.
9. RA queries are regarded as more powerful than RC queries.
10. RC queries are formed from well-formed formulas (WFFs), whereas RA does not form any formulas.
11. RA is procedural; we have to write the conditions in a specific order.
12. RC is non-procedural; the conditions can be written in any order.
The tuple relational calculus is based on specifying a number of tuple variables. Each such tuple variable normally ranges over a particular database relation. This means that the variable may take any individual tuple from that relation as its value. A simple tuple relational calculus query is of the form { t | COND(t) }, where 't' is a tuple variable and COND(t) is a conditional expression involving 't'. The result of such a query is a relation that contains all the tuples (rows) that satisfy COND(t).
For each tuple variable 't', the range relation 'R' of 't' is specified by a condition of the form R(t).
A condition to select the required tuples from the relation.
A set of attributes to be retrieved. This set is called the requested attributes. The values of
these attributes for each selected combination of tuples. If the requested attribute list is not
specified, then all the attributes of the selected tuples are retrieved.
The domain calculus differs from the tuple calculus in the type of variables used in formulas.
In domain calculus the variables range over single values from domains of attributes rather
than ranging over tuples. To form a relation of degree 'n' for a query result, we must have 'n' of
these domain variables-one for each attribute.
SQL
SQL stands for Structured Query Language.
SQL is used to communicate with a database.
According to ANSI (American National Standards Institute), it is the standard language
for relational database management systems.
SQL statements are used to perform tasks such as update data on a database, or
retrieve data from a database.
Some common relational database management systems that use SQL are: Oracle,
Sybase, Microsoft SQL Server, Access, Ingres, etc.
Some database systems require a semicolon at the end of each SQL statement. The semicolon is the standard way to separate SQL statements in database systems that allow more than one SQL statement to be executed in the same call to the server.
Example data type: TIMESTAMP stores year, month, day, hour, minute, and second values.
Commands :
1)Select
The SELECT statement is used to select data from a database. The result is stored in a result table, called the result-set.
SELECT column_name,column_name
FROM table_name;
OR
SELECT * FROM table_name;
Asterisk(*) means select all columns in the table.
2)Create Table
Used to create tables to store data. Integrity Constraints like primary key, unique key, foreign
key can be defined for the columns while creating the table. The integrity constraints can be
defined at column level or table level.
CREATE TABLE table_name
(
column_name1 data_type(size),
column_name2 data_type(size),
column_name3 data_type(size),
....
);
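A minimal sketch showing column-level and table-level constraints, assuming a hypothetical employee table and an existing department table:
CREATE TABLE employee
(
emp_id INT PRIMARY KEY,       -- column-level constraint
emp_name VARCHAR(50) NOT NULL,
email VARCHAR(100),
dept_id INT,
UNIQUE (email),               -- table-level constraint
FOREIGN KEY (dept_id) REFERENCES department(dept_id)   -- table-level constraint
);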
3)Create DB
Used to create a database.
CREATE DATABASE dbname;
4)Insert
Used to add new rows of data to a table.
INSERT INTO table_name
VALUES (value1,value2,value3,...);
OR
INSERT INTO table_name (column1,column2,column3,...)
VALUES (value1,value2,value3,...);
5)Update
Used to modify the existing rows in a table. In the UPDATE statement, the WHERE clause identifies the rows that get affected. If you do not include the WHERE clause, the column values of all the rows get affected.
UPDATE table_name
SET column1=value1,column2=value2,...
WHERE some_column=some_value;
6)Delete
Used to delete rows from a table. The WHERE clause in the SQL DELETE command is optional and identifies the rows that get deleted. If you do not include the WHERE clause, all the rows in the table are deleted, so be careful while writing a DELETE query without a WHERE clause.
DELETE FROM table_name
WHERE some_column=some_value;
7)Alter
Used to change the characteristics of a database or of a table. After creating a database, we can change its properties by executing the ALTER DATABASE statement; the user should have admin privileges for modifying a database. ALTER TABLE is used to add, modify or drop columns of an existing table:
ALTER TABLE table_name
ADD column_name datatype;
8)Order By
Used to sort the result-set by one or more columns.The ORDER BY keyword sorts the records
in ascending order by default. To sort the records in a descending order, you can use the
DESC keyword.
SELECT column_name, column_name
FROM table_name
ORDER BY column_name ASC|DESC, column_name ASC|DESC;
9)Where
Used to extract only those records that fulfill a specified criterion.
SELECT column_name,column_name
FROM table_name
WHERE column_name operator value;
10)Having Clause
Having clause is used to filter data based on the group functions. This is similar to WHERE
condition but is used with group functions. Group functions cannot be used in WHERE Clause
but can be used in HAVING clause.
If you want to select the departments whose total salary paid to employees is more than 25000, the SQL query would be:
SELECT dept, SUM (salary) FROM employee GROUP BY dept HAVING SUM (salary) >
25000
11)Group By
The SQL GROUP BY Clause is used along with the group functions to retrieve data grouped
according to one or more columns.
For Example: If you want to know the total amount of salary spent on each department,
the query would be:
SELECT dept, SUM (salary) FROM employee GROUP BY dept;
12) Group functions are built-in SQL functions that operate on groups of rows and return
one value for the entire group. These functions are: COUNT, MAX, MIN, AVG, SUM,
DISTINCT
SQL COUNT (): This function returns the number of rows in the table that satisfies the
condition specified in the WHERE condition. If the WHERE condition is not specified, then
the query returns the total number of rows in the table.
SQL MAX(): This function is used to get the maximum value from a column.
SQL MIN(): This function is used to get the minimum value from a column.
SQL AVG(): This function is used to get the average value of a numeric column.
SQL SUM(): This function is used to get the sum of a numeric column.
13)Comparison Operators
There are other comparison keywords available in SQL which are used to enhance the search capabilities of a SQL query. They are "IN", "BETWEEN...AND", "IS NULL" and "LIKE".
BETWEEN...AND: the column value is between two values, including the end values specified in the range.
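A minimal sketch of these comparison keywords, assuming a hypothetical employee table with name, dept and salary columns:
SELECT name FROM employee WHERE salary BETWEEN 20000 AND 40000;
SELECT name FROM employee WHERE dept IN ('HR', 'SALES');
SELECT name FROM employee WHERE name LIKE 'A%';
SELECT name FROM employee WHERE dept IS NULL;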
14)Joins
A JOIN clause is used to combine rows from two or more tables, based on a related column
between them.
(INNER) JOIN: Returns records that have matching values in both tables
SELECT column_name(s)
FROM table1
INNER JOIN table2 ON table1.column_name = table2.column_name;
LEFT (OUTER) JOIN: Return all records from the left table, and the matched records
from the right table
SELECT column_name(s)
FROM table1
LEFT JOIN table2 ON table1.column_name = table2.column_name;
RIGHT (OUTER) JOIN: Return all records from the right table, and the matched
records from the left table.
SELECT column_name(s)
FROM table1
RIGHT JOIN table2 ON table1.column_name = table2.column_name;
FULL (OUTER) JOIN: Return all records when there is a match in either left or right
table
SELECT column_name(s)
FROM table1
FULL OUTER JOIN table2 ON table1.column_name = table2.column_name;
A self JOIN is a regular join, but the table is joined with itself.
SELECT column_name(s)
FROM table1 T1, table1 T2
WHERE condition;
15)AUTO INCREMENT fields are used for automatically generating values for a particular column whenever a new row is inserted. Very often the primary key of a table needs to be created automatically; we define that field as an AUTO INCREMENT field.
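A minimal sketch using MySQL-style AUTO_INCREMENT syntax (other databases use different keywords such as IDENTITY or sequences), with a hypothetical persons table:
CREATE TABLE persons (
person_id INT AUTO_INCREMENT PRIMARY KEY,
last_name VARCHAR(50) NOT NULL
);
INSERT INTO persons (last_name) VALUES ('Sharma');
-- person_id is generated automatically for the new row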
16)SQL Views
A VIEW is a virtual table through which a selective portion of the data from one or more tables can be seen. Views do not contain data of their own. They are used to restrict access to the database or to hide data complexity. A view is stored as a SELECT statement in the database. DML operations on a view such as INSERT, UPDATE and DELETE affect the data in the original table upon which the view is based.
The Syntax to create a sql view is
CREATE VIEW view_name AS SELECT column_list FROM table_name [WHERE
condition];
view_name is the name of the VIEW.
The SELECT statement is used to define the columns and rows that you want to
display in the view.
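A minimal sketch of creating and querying a view, assuming a hypothetical employee table:
CREATE VIEW emp_public AS
SELECT emp_id, emp_name, dept
FROM employee
WHERE dept <> 'HR';
SELECT * FROM emp_public;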
17)SQL Index
An index in SQL is created on existing tables to retrieve rows quickly. When there are thousands of records in a table, retrieving information takes a long time. Therefore indexes are created on columns which are accessed frequently, so that the information can be retrieved quickly. Indexes can be created on a single column or on a group of columns. When an index is created, it first sorts the data and then assigns a ROWID to each row.
CREATE INDEX index_name ON table_name (column_name1,column_name2...);
Transaction
A transaction is a set of changes that must all be made together. It is a program unit whose execution may or may not change the contents of a database. A transaction is executed as a single unit. If the database was in a consistent state before a transaction, then after execution of the transaction the database must also be in a consistent state. For example, a transfer of money from one bank account to another requires two changes to the database; both must succeed or fail together.
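A minimal sketch of such a transfer as one transaction, assuming a hypothetical account table:
UPDATE account SET balance = balance - 1000 WHERE acc_no = 'A101';
UPDATE account SET balance = balance + 1000 WHERE acc_no = 'B202';
COMMIT;     -- both updates become permanent together
-- If either update fails, ROLLBACK; undoes both and the accounts stay consistent.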
A transaction is a logical unit of database processing that includes one or more access
operations:
A transaction (set of operations) may be stand-alone, specified in a high-level language like SQL and submitted interactively, or may be embedded within a program (say, Java, Python or C++). A user's program may carry out many operations on the data retrieved from the database, but the DBMS is only concerned with what data is read from or written to the database.
ACID Properties
The ACID model is one of the oldest and most important concepts of database theory. A transaction may contain several low-level tasks, and a transaction is a very small unit of any program. There is a set of properties that guarantee that database transactions are processed reliably. These properties are called the ACID properties and are the subject of the sections below:
Atomicity
Atomicity states that database modifications must follow an all-or-nothing rule. Though a transaction involves several low-level operations, this property states that a transaction must be treated as an atomic unit, that is, either all of its operations are executed or none. There must be no state in the database where the transaction is left partially completed; the database state should be defined either as it was before the execution of the transaction or as it is after the execution/abortion/failure of the transaction. A transaction must be fully completed and saved (committed) or completely undone (rolled back).
Consistency
The consistency property ensures that the database is in a consistent state before the start of the transaction and remains in a consistent state after the transaction is over (whether or not it was successful). There must not be any possibility that some data is incorrectly affected by the execution of a transaction.
If each transaction is consistent, and the database starts consistent, then the database ends up consistent. If a transaction violates the database's consistency rules, the entire transaction will be rolled back and the database will be restored to a state consistent with those rules.
Durability
Durability refers to the guarantee that once the user has been notified of success, the transaction will persist and not be undone. This property states that in any case all updates made on the database will persist even if the system fails and restarts. If a transaction writes or updates some data in the database and commits, that data will always be there in the database. If the transaction commits but the data is not yet written to disk and the system fails, that data will be updated once the system comes back up.
Once a transaction commits, the system must guarantee that the results of its operations will
never be lost, in spite of subsequent failures.
Isolation
Isolation refers to the requirement that other operations cannot access or see the data in an intermediate state during a transaction. This constraint is required to maintain performance as well as consistency between transactions in a database. Thus, each transaction is unaware of the other transactions executing concurrently in the system.
In other words, in a database system where more than one transaction is being executed simultaneously and in parallel, the property of isolation states that all the transactions will be carried out and executed as if each were the only transaction in the system. No transaction will affect the existence of any other transaction.
States of Transaction
A transaction must be in one of the following states:
Active: the initial state, the transaction stays in this state while it is executing.
Partially committed: after the final statement has been executed.
Failed: when the normal execution can no longer proceed.
Aborted: after the transaction has been rolled back and the database has been restored to
its state prior to the start of the transaction.
Committed: after successful completion.
When multiple transactions are trying to access the same sharable resource, many problems can arise if access control is not done properly. There are some important mechanisms by which access control can be maintained. Earlier we talked about theoretical concepts like serializability; in practice this is implemented using Locks and Timestamps.
Locks can be classified into two types.
Shared Lock: A transaction may acquire shared lock on a data item in order to read its content. The
lock is shared in the sense that any other transaction can acquire the shared lock on that same data
item for reading purpose.
Exclusive Lock: A transaction may acquire an exclusive lock on a data item in order to both read and write it. The lock is exclusive in the sense that no other transaction can acquire any kind of lock (either shared or exclusive) on that same data item.
The relationship between Shared and Exclusive locks can be represented by the following table, known as the Lock Matrix.
            Shared       Exclusive
Shared      Allowed      Not allowed
Exclusive   Not allowed  Not allowed
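A minimal sketch using Oracle-style LOCK TABLE syntax (the exact syntax and lock modes vary by DBMS), assuming a hypothetical account table:
LOCK TABLE account IN SHARE MODE;      -- shared lock: other transactions may still read or share-lock the table
LOCK TABLE account IN EXCLUSIVE MODE;  -- exclusive lock: no other transaction can lock the table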
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp based protocol. This protocol uses
either system time or logical counter as a timestamp.
Lock-based protocols manage the order between the conflicting pairs among transactions at the time
of execution, whereas timestamp-based protocols start working as soon as a transaction is created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age of the
transaction. A transaction created at 0002 clock time would be older than all other transactions that
come after it. For example, any transaction 'y' entering the system at 0004 is two seconds younger and
the priority would be given to the older one.
In addition, every data item is given the latest read and write-timestamp. This lets the system know
when the last read and write operation was performed on the data item.
Deadlock
A deadlock is a condition wherein two or more tasks are waiting for each other in order to finish, but none of the tasks is willing to give up the resources that the other task needs. In this situation no task ever gets finished; they remain in the waiting state forever.
Deadlock Prevention
The DBMS verifies each transaction and sees if there could be a deadlock situation upon execution of the transaction. If it finds that everything is fine, it allows the transaction to execute. If it finds that a deadlock could occur, it never allows the transaction to execute. The DBMS basically checks the timestamp at which a transaction was initiated and orders the transactions based on it. If there are transactions in the same time period requesting each other's resources, it stops those transactions before executing them. In that case, the DBMS will never allow the transactions to execute simultaneously. This method is suitable for large systems.
Wait-Die Scheme
In this scheme, if a transaction requests to lock a resource (data item) which is already held with a conflicting lock by another transaction, then one of two possibilities may occur:
If TS(Ti) < TS(Tj), that is, Ti (which is requesting the conflicting lock) is older than Tj, then Ti is allowed to wait until the data item is available.
If TS(Ti) > TS(Tj), that is, Ti is younger than Tj, then Ti dies. Ti is restarted later with a random delay but with the same timestamp.
This scheme allows the older transaction to wait but kills the younger one.
Wound-Wait Scheme
In this scheme, if a transaction requests to lock a resource (data item) which is already held with a conflicting lock by another transaction, one of two possibilities may occur:
If TS(Ti) < TS(Tj), then Ti forces Tj to be rolled back, that is, Ti wounds Tj. Tj is restarted later with a random delay but with the same timestamp.
If TS(Ti) > TS(Tj), then Ti is forced to wait until the resource is available.
This scheme, allows the younger transaction to wait; but when an older transaction requests an item
held by a younger one, the older transaction forces the younger one to abort and release the item.
In both the cases, the transaction that enters the system at a later stage is aborted.
Deadlock Avoidance:
It is always better to avoid deadlock in a system rather than aborting or restarting transactions, which wastes time and resources. The wait-for graph is one of the methods for detecting a deadlock situation, but this method is suitable for smaller databases; for large databases, the deadlock prevention method may help.
Wait-for Graph
This is a simple method available to track if any deadlock situation may arise. For each transaction
entering into the system, a node is created. When a transaction Ti requests for a lock on an item, say X,
which is held by some other transaction Tj, a directed edge is created from Ti to Tj. If Tj releases item X,
the edge between them is dropped and Ti locks the data item.
The system maintains this wait-for graph for every transaction waiting for some data items held by
others. The system keeps checking if there's any cycle in the graph.
Storage media are classified by speed of access, cost per unit of data to buy the media, and by the medium's reliability. Unfortunately, as speed and cost go up, the reliability goes down.
1. Cache is the fastest and the most costly form of storage. The type of cache referred to here is the type that is typically built into the CPU chip and is 256KB, 512KB, or 1MB. Cache is managed by the hardware and operating system and has no direct application to the database, per se.
2. Main memory is the volatile memory in the computer system that is used to hold
programs and data. While prices have been dropping at a staggering rate, the demand for memory has been increasing even faster. Today's 32-bit computers have a limitation of 4GB of memory. This may not be sufficient to hold the entire database and all the associated programs, but the more memory available, the better the response time of the DBMS. There are attempts underway to create a system with the most memory that is cost effective, and to reduce the functionality of
system with the most memory that is cost effective, and to reduce the functionality of
the operating system so that only the DBMS is supported, so that system response can
be increased. However, the contents of main memory are lost if a power failure or
system crash occurs.
3. Flash memory is also referred to as electrically erasable programmable read-only
memory (EEPROM). Since it is small (5 to 10MB) and expensive, it has little or no
application to the DBMS.
4. Magnetic-disk storage is the primary medium for long-term on-line storage today.
Prices have been dropping significantly with a corresponding increase in capacity. New
disks today are in excess of 20GB. Unfortunately, the demands have been increasing
and the volume of data has been increasing faster. The organizations using a DBMS
are always trying to keep up with the demand for storage. This media is the most cost-
effective for on-line storage for large databases.
5. Optical storage is very popular, especially CD-ROM systems. This is limited to data
that is read-only. It can be reproduced at a very low-cost and it is expected to grow in
popularity, especially for replacing written manuals.
6. Tape storage is used for backup and archival data. It is cheaper and slower than all of
the other forms, but it does have the feature that there is no limit on the amount of data
that can be stored, since more tapes can be purchased. As the tapes get increased
capacity, however, restoration of data takes longer and longer, especially when only a
small amount of data is to be restored. This is because the retrieval is sequential, the
slowest possible method.
Magnetic Disks
Disks are actually relatively simple. There is normally a collection of platters on a spindle. Each
platter is coated with a magnetic material on both sides and the data is stored on the surfaces.
There is a read-write head for each surface that is on an arm assembly that moves back and
forth. A motor spins the platters at a high constant speed (60, 90, or 120 revolutions per second).
The surface is divided into a set of tracks (circles). These tracks are divided into a set of
sectors; a sector is the smallest unit of data that can be written or read at one time. Sectors can range in size from 32 bytes to 4096 bytes, with 512 bytes being the most common. A collection of a specific track from both surfaces and from all of the platters is called a cylinder.
Platters can range in size from 1.8 inches to 14 inches. Today, 5 1/4 inch and 3 1/2 inch platters are the most common, because they combine good seek times with low cost.
A disk controller interfaces the computer system and the actual hardware of the disk drive. The
controller accepts high-level commands to read or write sectors. The controller then converts the commands into the necessary specific low-level commands. The controller will also attempt
to protect the integrity of the data by computing and using checksums for each sector. When
attempting to read the data back, the controller recalculates the checksum and makes several
attempts to correctly read the data and get matching checksums. If the controller is
unsuccessful, it will notify the operating system of the failure.
The controller can also handle the problem of eliminating bad sectors. Should a sector go bad,
the controller logically remaps the sector to one of the extra unused sectors that disk vendors
provide, so that the reliability of the disk system is higher. It is cheaper to produce disks with a
greater amount of sectors than advertised and then map out bad sectors than it is to produce
disks with no bad sectors or with extremely limited possibility of sectors going bad.
There are many different types of disk controllers, but the most common ones today are SCSI,
IDE, and EIDE.
One other characteristic of disks that affects performance is the distance from the read-write head to the surface of the platter. The smaller this gap, the more densely data can be written on the disk, so that the tracks can be closer together and the disk has a greater capacity. The distance is often measured in microns. However, a smaller gap means that the possibility of the head touching the surface is increased. When the head touches the surface while the surface is spinning at high speed, the result is called a "head crash", which scratches the surface and damages the head. The bottom line is that someone must replace the disk.
1. Seek time is the time to reposition the head and increases with the distance that the
head must move. Seek times can range from 2 to 30 milliseconds. Average seek
time is the average of all seek times and is normally one-third of the worst-case seek
time.
2. Rotational latency time is the time from when the head is over the correct track until the data rotates around and is under the head and can be read. When the disk spins at 120 rotations per second, the rotation time is 8.33 milliseconds. Normally, the average rotational latency time is one-half of the rotation time.
3. Access time is the time from when a read or write request is issued to when the data
transfer begins. It is the sum of the seek time and latency time.
4. Data-transfer rate is the rate at which data can be retrieved from the disk and sent to
the controller. This will be measured as megabytes per second.
5. Mean time to failure is the number of hours (on average) until a disk fails. Typical
times today range from 30,000 to 800,000 hours (or 3.4 to 91 years).
Requests for disk I/O are generated by both the file system and by the virtual memory manager found in most systems. Each request specifies the address on the disk to be referenced; that address is in the form of a block number. Each block is a contiguous sequence of sectors from a single track of one platter and ranges from 512 bytes to several kilobytes of data. The lower-level file manager must convert block addresses into the hardware-level cylinder, surface, and sector numbers.
Since access to data on disk is several orders of magnitude slower than access to data in main memory, much attention has been paid to improving the speed of access to blocks on the disk. This is also where more main memory can speed up the response time, by making sure that the data needed is in memory when it is needed.
This is the same problem that is addressed in designing operating systems, to insure the best
response time from the file system manager and the virtual memory manager.
Scheduling. Disk-arm scheduling algorithms attempt to order accesses so as to increase the number of accesses that can be processed in a given amount of time. These might include First-Come/First-Served, Shortest Seek First, and the elevator algorithm.
File organization. To reduce block-access time, data could be arranged on the disk in
the same order that it is expected to be retrieved. (This would be storing the data on
the disk in order based on the primary key.) At best, this starts to produce less and less
of a benefit, as there are more inserts and deletes. Also we have little control of where
on the disk things get stored. The more the data gets fragmented on the disk, the more
time it takes to locate it.
Nonvolatile write buffer. Non-volatile memory (flash memory) can be used to protect the data in memory from crashes, but it does increase the cost. It is possible that the use of a UPS would be more effective and cheaper.
Log disk. You can use a disk for writing a sequential log.
Buffering. The more information you have in buffers in main memory, the more likely you are to avoid fetching it from the disk. However, it is also more likely that memory will be wasted on information that turns out not to be needed.
RAID
RAIDs are Redundant Arrays of Inexpensive Disks. There are seven levels (0 through 6) of organizing these disks:
0 -- Non-redundant Striping
1 -- Mirrored Disks
2 -- Memory Style Error Correcting Codes
3 -- Bit Interleaved Parity
4 -- Block Interleaved Parity
5 -- Block Interleaved Distributed Parity
6 -- P + Q Redundancy
Tertiary Storage
Tertiary storage refers to removable media such as optical disks and magnetic tapes; it is slower and cheaper than disk and is used mainly for backup and archival data.
Storage Access
A database is mapped into a number of different files, which are maintained by the underlying operating system. Files are organized into blocks, and a block may contain one or more data items.
A major goal of the DBMS is to minimize the number of block transfers between the disk and memory. Since it is not possible to keep all blocks in main memory, we need to manage the allocation of the space available for the storage of blocks. This is similar to the problems encountered by the operating system, and can conflict with the operating system, since the OS is concerned with all processes while the DBMS is concerned with only one family of processes.
Buffer Manager
Programs in a DBMS make requests (that is, calls) on the buffer manager when they need a block from disk. If the block is already in the buffer, the requester is passed the address of the block in main memory. If the block is not in the buffer, the buffer manager first allocates space in the buffer for the block, throwing out some other block, if required, to make space for the new block. If the block that is to be thrown out has been modified, it must first be written back to the disk. The internal actions of the buffer manager are transparent to the programs that issue disk-block requests.
Replacement strategy. When there is no room left in the buffer, a block must be removed from the buffer before a new one can be read in. Typically, operating systems use a least recently used (LRU) scheme. There is also a most recently used (MRU) scheme that can be more appropriate for a DBMS.
Pinned blocks. A block that is not allowed to be written back to disk is said to be
pinned. This could be used to store data that has not been committed yet.
Forced output of blocks. There are situations in which it is necessary to write a block back to the disk even though the buffer space it occupies is not currently needed. This might be done during system lulls, so that when activity picks up, writes of modified blocks can be avoided in peak periods.
File Organization
Fixed-Length Records
Consider a record in which each character occupies 1 byte and a real occupies 8 bytes, so that the record occupies 40 bytes. If the first record occupies the first 40 bytes, the second record occupies the next 40 bytes, and so on, we have some problems.
It is difficult to delete a record, because there is no way to indicate that the record is deleted. (At least one system automatically adds one byte to each record as a flag to show if the record is deleted.) Unless the block size happens to be a multiple of 40 (which is extremely unlikely), some records will cross block boundaries, and it would require two block accesses to read or write such a record.
One solution might be to compress the file after each deletion. This will incur a major amount
of overhead processing, especially on larger files. Additionally, there is the same problem on
inserts!
Another solution would be to have two sets of pointers: one that links the current record to the next logical record (a linked list), plus a free list (a list of free slots). This increases the size of the file.
Variable-Length Records
The problems above can be addressed by the way records are organized in files:
Heap file organization: any record can be placed anywhere in the file where there is space. There is no ordering of records, and there is a single file for each relation.
Hashing file organization: a hash function is computed on some attribute of each record. The function specifies in which block the record should be placed.
Clustering file organization: several different relations can be stored in the same file. Related records of the different relations can be stored in the same block.
A RDBMS needs to maintain data about the relations, such as the schema. This is stored in a
data dictionary (sometimes called a system catalog):
Names of the relations
Names of the attributes of each relation
Domains and lengths of attributes
Names of views, defined on the database, and definitions of those views
Integrity constraints
Names of authorized users
Accounting information about users
Number of tuples in each relation
Method of storage for each relation (clustered/non-clustered)
Name of the index
Name of the relation being indexed
Attributes on which the index is defined
Type of index formed
Indexing
The main goal of database design is faster access to any data in the database and quicker insert/delete/update of any data. This is because no one likes waiting. When a database is very large, even the smallest transaction takes time to perform. In order to reduce the time spent in transactions, indexes are used. Indexes are similar to book catalogues in a library, or the index in a book: they make searching simpler and quicker. The same concept is applied in a DBMS to access the files from memory.
When records are stored in primary memory such as RAM, accessing them is very easy and quick. But the records in a database are too numerous and too large to keep them all in RAM, so they have to be stored in secondary memory such as a hard disk. As we have seen already, records are not stored in memory the way we see them as tables; they are stored in the form of files in different data blocks. Each block is capable of storing one or more records depending on its size.
When we have to retrieve any required data or perform some transaction on those data, we
have to pull them from memory, perform the transaction and save them back to the memory. In
order to do all these activities, we need to have a link between the records and the data blocks
so that we can know where these records are stored. This link between the records and the
data block is called index. It acts like a bridge between the records and the data block.
Indexing is defined based on its indexing attributes. Indexing can be of the following
types
Primary Index: A primary index is defined on an ordered data file. The data file is ordered on a key field. The key field is generally the primary key of the relation.
Dense Index
Sparse Index
Dense Index
In this case, an index entry is created for the primary key as well as for the other columns on which we perform transactions. That means the user can fire a query based not only on the primary key column but on any column in the table, according to his requirement. An index only on the primary key will not help in that case, hence indexes on all the search-key columns are stored. This method is called a dense index.
Sparse Index
In order to address the issues of dense indexing, sparse indexing is introduced. In this method of indexing, a range of index columns stores the same data block address, and when data is to be retrieved, the block address is fetched and the block is searched linearly until we reach the requested data.
Multilevel Index
Index records comprise search-key values and data pointers. The multilevel index is stored on the disk along with the actual database files. As the size of the database grows, so does the size of the indices. There is an immense need to keep the index records in main memory so as to speed up search operations. If a single-level index is used, then a large index cannot be kept in memory, which leads to multiple disk accesses; a multilevel index therefore builds an index on the index, so that the outermost level is small enough to be kept in main memory.
B+ Tree
A B-tree is a method of placing and locating files (called records or keys) in a database. (The
meaning of the letter B has not been explicitly defined.) The B-tree algorithm minimizes the
number of times a medium must be accessed to locate a desired record, thereby speeding up
the process.
B-trees are preferred when decision points, called nodes, are on hard disk rather than in
random-access memory (RAM). It takes thousands of times longer to access a data element
from hard disk as compared with accessing it from RAM, because a disk drive has mechanical
parts, which read and write data far more slowly than purely electronic media. B-trees save
time by using nodes with many branches (called children), compared with binary trees, in
which each node has only two children. When there are many children per node, a record can
be found by passing through fewer nodes than if there are two children per node.
In a tree, records are stored in locations called leaves. This name derives from the fact that
records always exist at end points; there is nothing beyond them. The maximum number of
children per node is the order of the tree. The number of required disk accesses is the depth.
As an example, a binary tree for locating a particular record in a set of eight leaves has a depth of four, while a B-tree of order three for locating a record in the same set of eight leaves (with the ninth leaf unoccupied, called a null) has a depth of three. Clearly, the B-tree allows a desired record to be located faster, assuming all other system parameters are identical. The tradeoff is that the decision process at each node is more complicated in a B-tree as compared with a binary tree.
A sophisticated program is required to execute the operations in a B-tree. But this program is
stored in RAM, so it runs fast.
In a practical B-tree, there can be thousands, millions, or billions of records. Not all leaves
necessarily contain a record, but at least half of them do. The difference in depth between
binary-tree and B-tree schemes is greater in a practical database than in the example
illustrated here, because real-world B-trees are of higher order (32, 64, 128, or more).
Depending on the number of records in the database, the depth of a B-tree can and often does
change. Adding a large enough number of records will increase the depth; deleting a large
enough number of records will decrease the depth. This ensures that the B-tree functions
optimally for the number of records it contains.
Hashing
Hash File organization method is the one where data is stored at the data blocks whose
address is generated by using hash function. The memory location where these records are
stored is called as data block or data bucket. This data bucket is capable of storing one or
more records.
The hash function can use any column value to generate the address. Most of the time, the hash function uses the primary key to generate the hash index, i.e. the address of the data block. The hash function can be anything from a simple mathematical function to a complex mathematical function. We can even consider the primary key itself as the address of the data block; in that case each row is stored at the data block whose address is the same as its primary key.
Hash Organization
Bucket: A hash file stores data in bucket format. A bucket is considered a unit of storage. A bucket typically holds one complete disk block, which in turn can store one or more records.
Hash Function: A hash function, h, is a mapping function that maps the set of all search-keys K to the addresses where the actual records are placed. It is a function from search keys to bucket addresses.
Consider a set of people's names stored in a database; each name would be the key in the database for that person's data. A database search mechanism would first have to start looking character-by-character across the name for matches until it found the match (or ruled the other entries out). But if each of the names were hashed, it might be possible (depending on the number of names in the database) to generate a unique four-digit key for each name. For example:
7864 Abernathy, Sara
9802 Epperdingle, Roscoe
1990 Moore, Wilfred
8822 Smith, David
(and so forth)
A search for any name would first consist of computing the hash value (using the same hash
function used to store the item) and then comparing for a match using that value. It would, in
general, be much faster to find a match across four digits, each having only 10 possibilities,
than across an unpredictable value length where each character had 26 possibilities.
There are two types of hash file organizations Static and Dynamic Hashing.
Static Hashing
In static hashing, when a search-key value is provided, the hash function always computes the same address. For example, if a mod-4 hash function is used, it generates only 4 values. The output address is always the same for a given key, and the number of buckets provided remains unchanged at all times.
Bucket Overflow
The condition of bucket-overflow is known as collision. This is a fatal state for any static hash
function. In this case, overflow chaining can be used.
Overflow Chaining When buckets are full, a new bucket is allocated for the same
hash result and is linked after the previous one. This mechanism is called Closed
Hashing.
Linear Probing When a hash function generates an address at which data is already
stored, the next free bucket is allocated to it. This mechanism is called Open Hashing.
Dynamic Hashing
The problem with static hashing is that it does not expand or shrink dynamically as the size of
the database grows or shrinks. Dynamic hashing provides a mechanism in which data
buckets are added and removed dynamically and on-demand. Dynamic hashing is also
known as extended hashing.
Hash function, in dynamic hashing, is made to produce a large number of values and only a
few are used initially.
The prefix of the entire hash value is taken as a hash index. Only a portion of the hash value is used for computing bucket addresses. Every hash index has a depth value to signify how many bits are used for computing the hash function. These bits can address 2^n buckets. When all these bits are consumed, that is, when all the buckets are full, the depth value is increased linearly and twice as many buckets are allocated.
Hashing is not favorable when the data is organized in some ordering and the queries require
a range of data. When data is discrete and random, hash performs the best.
Hashing algorithms are more complex to implement than indexing, but all hash operations are done in constant time.
Data Backup:
In a computer system we have primary and secondary memory storage. Primary memory
storage devices - RAM is a volatile memory which stores disk buffer, active logs, and other
related data of a database. It stores all the recent transactions and the results too. When a
query is fired, the database first fetches in the primary memory for the data, if it does not exist
there, then it moves to the secondary memory to fetch the record. Fetching the record from
primary memory is always faster than secondary memory. What happens if the primary
memory crashes? All the data in the primary memory is lost and we cannot recover the
database.
In such cases, we can follow any one of the following steps so that the data in primary memory is not lost.
We can keep a copy of the contents of primary memory, with all the logs and buffers, and copy it periodically into the database. In case of a failure we will then not lose all the data; we can recover the data up to the point at which it was last copied to the database.
We can have checkpoints created at several places so that the data is copied to the database.
Suppose the secondary memory itself crashes. What happens to the data stored in it? All the data is lost and cannot be recovered. We have to think of an alternative solution because we cannot afford to lose data in a huge database.
There are three methods used to back up the data in the secondary memory, so that it can be
recovered if there is any failure.
Remote Backup: A copy of the database is created and stored on a remote network. This database is periodically updated from the current database so that it stays in sync. The remote database can be updated manually, which is called offline backup, or it can be backed up online, where the data is updated in the current and remote databases simultaneously. In the online case, as soon as there is a failure of the current database, the system automatically switches to the remote database and starts functioning. The user will not even know that there was a failure.
In the second method, database is copied to memory devices like magnetic tapes and
kept at secured place. If there is any failure, the data would be copied from these
tapes to bring the database up.
As the database grows, it becomes an overhead to back up the whole database. Hence only the log files are backed up at regular intervals. These log files have all the information about the transactions being made, so by replaying these log files the database can be recovered. In this method, log files are backed up at regular intervals and the full database is backed up once a week.
There are two types of data backup: physical data backup and logical data backup. The physical data backup includes physical files such as data files, log files, control files, and redo/undo logs. They are the foundation of the recovery mechanism in the database, as they provide the minute details about the transactions and modifications to the database.
Logical backup includes backup of logical data like tables, views, procedures, functions etc.
Logical data backup alone is not sufficient to recover the database as they provide only the
structural information. The physical data back actually provides the minute details about the
database and is very much important for recovery.
Data Recovery:
Data recovery is the process of restoring data that has been lost, accidentally
deleted, corrupted or made inaccessible. In enterprise IT, data recovery typically refers to the
restoration of data to a desktop, laptop, server or external storage system from a backup.
Failure Classification
To see where the problem has occurred, we generalize a failure into various categories, as
follows:
Transaction failure
A transaction has to abort when it fails to execute or when it reaches a point from where it
can't go any further. This is called transaction failure, where only a few transactions or
processes are affected.
Logical errors: Where a transaction cannot complete because it has some code
error or any internal error condition.
System errors: Where the database system itself terminates an active transaction
because the DBMS is not able to execute it, or it has to stop because of some system
condition. For example, in case of deadlock or resource unavailability, the system
aborts an active transaction.
System Crash
There are problems external to the system that may cause the system to stop abruptly
and cause the system to crash. For example, interruptions in power supply may cause the
failure of underlying hardware or software failure.
Disk Failure
In early days of technology evolution, it was a common problem where hard-disk drives or
storage drives used to fail frequently.
Disk failures include formation of bad sectors, unreachability of the disk, disk head crash or
any other failure which destroys all or a part of disk storage.
A transaction may be in the middle of some operation; the DBMS must ensure the
atomicity of the transaction in this case.
It should check whether the transaction can be completed now or it needs to be rolled
back.
There are two types of techniques which can help a DBMS in recovering as well as
maintaining the atomicity of a transaction:
Maintaining the logs of each transaction, and writing them onto some stable storage
before actually modifying the database.
Maintaining shadow paging, where the changes are done on a volatile memory, and
later, the actual database is updated.
Log-based Recovery
Log is a sequence of records, which maintains the records of actions performed by a
transaction. It is important that the logs are written prior to the actual modification and stored
on a stable storage media, which is failsafe.
When a transaction enters the system and starts execution, it writes a log about it.
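As an illustration (using a common textbook convention, not a format defined in these notes), each update record in the log can be written as <transaction, data item, old value, new value>. A transaction Tn that changes X from 500 to 450 would then produce records such as:
<Tn, Start>
<Tn, X, 500, 450>
<Tn, Commit>
If a failure occurs before <Tn, Commit> reaches stable storage, the old value 500 is restored (undo); if it occurs after, the new value 450 is re-applied (redo).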
XML:
XML is a markup language which is mainly used to represent structured data. Structured
data is data that carries a tag/label along with it to indicate what the data is. It
is like data whose tag plays the role of a column name in an RDBMS. Hence the same is used
to document the data in a DDB. One may wonder why we need XML rather than simply
documenting the data with simple tags, as in a plain contact-detail example. XML provides
lots of features to handle the structured data within the document.
XML is a markup language which serves structured data over the internet, and this data
can be viewed by the user easily as well as quickly.
It supports lots of different types of applications.
XML does not have optional features that would increase its complexity.
Hence XML is a simple language which any user can use with minimal knowledge.
XML documents can be created very quickly. They do not need the thorough analysis,
design and development phases required in an RDBMS. In addition, one can
create and view XML in Notepad too.
All these features of XML make it unique and ideal to represent DDB.
XML-enabled
Distributed Database:
A distributed database is a database in which portions of the database are stored in multiple physical
locations and processing is distributed among multiple database nodes.
A centralized distributed database management system (DDBMS) integrates the data logically so it can
be managed as if it were all stored in the same location. The DDBMS synchronizes all the data
periodically and ensures that updates and deletes performed on the data at one location will be
automatically reflected in the data stored elsewhere.
Distribution: It states the physical distribution of data across the different sites.
Autonomy: It indicates the distribution of control of the database system and the
degree to which each constituent DBMS can operate independently.
Architectural Models
Some of the common architectural models are
Client - Server Architecture for DDBMS
Peer - to - Peer Architecture for DDBMS
Multi - DBMS Architecture
Client - Server Architecture for DDBMS
This is a two-level architecture where the functionality is divided into servers and clients. The
server functions primarily encompass data management, query processing, optimization and
transaction management. Client functions mainly include the user interface. However, clients
also have some functions like consistency checking and transaction management.
Design Alternatives
The distribution design alternatives for the tables in a DDBMS are as follows
Fully Replicated
In this design alternative, one copy of all the database tables is stored at each site. Since
each site has its own copy of the entire database, queries are very fast, requiring negligible
communication cost. On the contrary, the massive redundancy in data requires huge cost
during update operations. Hence, this is suitable for systems where a large number of queries
are required to be handled whereas the number of database updates is low.
Partially Replicated
Copies of tables or portions of tables are stored at different sites. The distribution of the tables
is done in accordance with the frequency of access. This takes into consideration the fact that
the frequency of accessing the tables varies considerably from site to site. The number of
copies of the tables (or portions) depends on how frequently the access queries execute and
the sites which generate the access queries.
Fragmented
In this design, a table is divided into two or more pieces referred to as fragments or partitions,
and each fragment can be stored at different sites. This considers the fact that it seldom
happens that all data stored in a table is required at a given site. Moreover, fragmentation
increases parallelism and provides better disaster recovery. Here, there is only one copy of
each fragment in the system, i.e. no redundant data.
Vertical fragmentation
Horizontal fragmentation
Hybrid fragmentation
Mixed Distribution
This is a combination of fragmentation and partial replications. Here, the tables are initially
fragmented in any form (horizontal or vertical), and then these fragments are partially
replicated across the different sites according to the frequency of accessing the fragments.
Shadow Paging:
This is the method where all the transactions are executed on the primary memory or the
shadow copy of the database. Once all the transactions are completely executed, the changes
are updated to the database. Hence, if there is any failure in the middle of a transaction, it will
not be reflected in the database; the database is updated only after all the transactions are
complete. A database pointer always points to the consistent copy of the database, and a copy
of the database is used by transactions for their updates. Once all the transactions are
complete, the DB pointer is modified to point to the new copy of the DB, and the old copy is
deleted. If there is any failure during the transaction, the pointer will still be pointing to the old
copy of the database, and the shadow copy will be deleted. If the transactions are complete,
then the pointer is changed to point to the shadow DB, and the old DB is deleted.
ORACLE
An Oracle database is a collection of data treated as a unit. The purpose of a database is
to store and retrieve related information. A database server is the key to solving the
problems of information management.
A primary key can be set on up to 16 columns of a table in Oracle 9i as well as in Oracle
10g.
The maximum number of data files in Oracle 9i and Oracle 10g Database is 65,536.
The physical database structures are the files that store the data. When you execute the
SQL command CREATE DATABASE, the following files are created:
Data files
Every Oracle database has one or more physical data files, which contain all the
database data. The data of logical database structures, such as tables and indexes,
is physically stored in the data files.
Control files
Every Oracle database has a control file. A control file contains metadata specifying
the physical structure of the database, including the database name and the names
and locations of the database files.
Online redo log files
Every Oracle Database has an online redo log, which is a set of two or
more online redo log files. An online redo log is made up of redo entries (also
called redo records), which record all changes made to data.
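A quick way to see these physical structures is to query the dynamic performance views; a minimal sketch (the view names are standard, but the file names returned will differ for every installation):
SELECT name   FROM v$datafile;    -- data files
SELECT name   FROM v$controlfile; -- control files
SELECT member FROM v$logfile;     -- online redo log files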
LOGICAL STORAGE STRUCTURES
This section discusses logical storage structures. The following logical storage structures
enable Oracle Database to have fine-grained control of disk space use:
Data blocks
At the finest level of granularity, Oracle Database data is stored in data blocks.
One data block corresponds to a specific number of bytes on disk.
Extents
Segments
A segment is a set of extents allocated for a user object (for example, a table or
index), undo data, or temporary data.
Tablespaces
Redo: In the Oracle RDBMS environment, redo logs comprise files in a proprietary format
which log a history of all changes made to the database. Each redo log file consists of
redo records. A redo record, also called a redo entry, holds a group of change vectors,
each of which describes or represents a change made to a single block in the database.
For example, if a user UPDATEs a salary-value in a table containing employee-related data,
the DBMS generates a redo record containing change-vectors that describe changes to
the data segment block for the table. And if the user then COMMITs the update, Oracle
generates another redo record and assigns the change a "system change number" (SCN).
LGWR writes to redo log files in a circular fashion. When the current redo log file fills,
LGWR begins writing to the next available redo log file. When the last available redo log
file is filled, LGWR returns to the first redo log file and writes to it, starting the cycle
again.
Oracle Database uses only one redo log file at a time to store redo records written from
the redo log buffer. The redo log file that LGWR is actively writing to is called
the current redo log file. Redo log files that are required for instance recovery are
called active redo log files. Redo log files that are no longer required for instance recovery
are called inactive redo log files.
A log switch is the point at which the database stops writing to one redo log file and begins
writing to another. Normally, a log switch occurs when the current redo log file is
completely filled and writing must continue to the next redo log file. However, you can
configure log switches to occur at regular intervals, regardless of whether the current redo
log file is completely filled. You can also force log switches manually.
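A manual log switch is a single statement; a minimal sketch (requires the ALTER SYSTEM privilege):
ALTER SYSTEM SWITCH LOGFILE;
After the switch, LGWR starts writing to the next available redo log file, and the previous file becomes active or inactive depending on whether it is still needed for instance recovery.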
Oracle Database assigns each redo log file a new log sequence number every time a log
switch occurs and LGWR begins writing to it. When the database archives redo log files,
the archived log retains its log sequence number. A redo log file that is cycled back for use
is given the next available log sequence number.
UNDO: Oracle Database creates and manages information that is used to roll back, or
undo, changes to the database. Such information consists of records of the actions of
transactions, primarily before they are committed. These records are collectively referred
to as undo.
Undo records are used to:
Roll back transactions when a ROLLBACK statement is issued
Recover the database
Provide read consistency
Analyze data as of an earlier point in time by using Oracle Flashback Query
Recover from logical corruptions using Oracle Flashback features.
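As an illustration of the last two uses, a Flashback Query reads data as of an earlier point in time from undo; a minimal sketch, assuming an employees table exists and enough undo is retained:
SELECT * FROM employees
  AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '15' MINUTE)
  WHERE employee_id = 100;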
Snapshots can also contain a WHERE clause so that snapshot sites can contain
customized data sets. Such snapshots can be helpful for regional offices or sales forces
that do not require the complete corporate data set. When a snapshot is refreshed, Oracle
must examine all of the changes to the master table to see if any apply to the snapshot.
Therefore, if any changes were made to the master table since the last refresh, a
snapshot refresh will take some time, even if the refresh does not apply any changes to
the snapshot. If, however, no changes at all were made to the master table since the last
refresh of a snapshot, the snapshot refresh should be very quick.
Snapshot and materialized view are almost the same, but with one difference.
You can say that materialized view = snapshot + query rewrite functionality. Query rewrite
functionality: in a materialized view you can enable or disable the query rewrite option, which
means the database server will rewrite a query so as to give higher performance. Query
rewrite is based on rewrite rules defined by Oracle itself, so the database server will
follow these rules and rewrite queries to use the materialized view, but this
functionality is not there in snapshots.
Simple snapshots are the only type that can use the FAST REFRESH method. A snapshot
is considered simple if the defining query meets the following criteria:
Oracle8 extends the universe of simple snapshots with a feature known as subquery
subsetting, described in the later section entitled Subquery Subsetting.
Not surprisingly, any snapshot that is not a simple snapshot is a complex snapshot.
Complex snapshots can only use COMPLETE refreshes, which are not always practical.
For tables of more than about 100,000 rows, COMPLETE refreshes can be quite unwieldy.
You can often avoid this situation by creating simple snapshots of individual tables at the
master site and performing the offending query against the local snapshots.
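A minimal sketch of a simple snapshot that can use the FAST refresh method, assuming an illustrative master table emp and database link master_db:
-- at the master site, record row changes for fast refresh:
CREATE MATERIALIZED VIEW LOG ON emp;
-- at the snapshot site, a simple single-table snapshot:
CREATE MATERIALIZED VIEW emp_snap
  REFRESH FAST ON DEMAND
  AS SELECT empno, ename, sal FROM emp@master_db;
Because the defining query references a single table with no joins, aggregates or set operations, only the rows changed since the last refresh need to be applied; a WHERE clause can also be added, subject to the simple-snapshot restrictions described above.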
1. System Global Area (SGA):- This is a large, shared memory segment that virtually
all Oracle processes will access at one point or another.
2. Process Global Area (PGA): This is memory that is private to a single process
or thread; it is not accessible from other processes/threads.
3. User Global Area (UGA): This is memory associated with your session. It is
located either in the SGA or the PGA, depending on whether you are connected to
the database using a shared server (it will be in the SGA) or a dedicated server (it
will be in the PGA).
1)SGA:
There are several memory structures that make up the System Global Area (SGA). The SGA
will store many internal data structures that all processes need access to, cache data from
disk, cache redo data before writing to disk, hold parsed SQL plans and so on. The SGA is
used to store database information that is shared by database processes. It contains data
and control information for the Oracle server and is allocated in the virtual memory of the
computer where Oracle resides.
1.Redo Buffer: The redo buffer is where data that needs to be written to the online redo
logs will be cached temporarily, before it is written to disk. Since a memory-to-memory
transfer is much faster than a memory-to-disk transfer, use of the redo log buffer can
speed up database operation. The data will not reside in the redo buffer for very long. In
fact, LGWR initiates a flush of this area in one of the following scenarios:
Every three seconds
Whenever someone commits
When LGWR is asked to switch log files
When the redo buffer gets one-third full or contains 1MB of cached redo log data
Use the LOG_BUFFER parameter to adjust its size, but be careful about increasing it too
much: a larger buffer will reduce your I/O, but commits will take longer.
2.Buffer Cache: The block buffer cache is where Oracle stores database blocks before
writing them to disk and after reading them in from disk. There are three places to store
cached blocks from individual segments in the SGA:
Default pool (hot cache): The location where all segment blocks are normally cached.
Keep pool (warm cache): An alternate buffer pool where by convention you assign
segments that are accessed fairly frequently, but still get aged out of the default buffer pool
due to other segments needing space.
Recycle pool (do not care to cache): An alternate buffer pool where by convention you
assign large segments that you access very randomly, and which would therefore
cause excessive buffer flushing of many blocks from many segments. There's no benefit to
caching such segments because by the time you want the block again, it would have
been aged out of the cache. You would separate these segments out from the segments in
the default and keep pools so they would not cause those blocks to age out of the cache.
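Assigning a segment to the keep or recycle pool is done through its storage clause; a minimal sketch with illustrative table names:
ALTER TABLE lookup_codes  STORAGE (BUFFER_POOL KEEP);     -- small table read very frequently
ALTER TABLE audit_history STORAGE (BUFFER_POOL RECYCLE);  -- large table accessed randomly
Segments with no explicit assignment remain in the default pool.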
3.Shared Pool: The shared pool is where Oracle caches many bits of program data.
When we parse a query, the parsed representation is cached there. Before we go through
the job of parsing an entire query, Oracle searches the shared pool to see if the work has
already been done. PL/SQL code that you run is cached in the shared pool, so the next
time you run it, Oracle doesn't have to read it in from disk again. PL/SQL code is not only
cached here, it is shared here as well. If you have 1,000 sessions all executing the same
code, only one copy of the code is loaded and shared among all sessions. Oracle stores
the system parameters in the shared pool. The data dictionary
cache (cached information about database objects) is stored here. The dictionary cache is a
collection of database tables and views containing information about the database, its
structures, privileges and users. When statements are issued, Oracle will check
permissions, access, etc. and will obtain this information from its dictionary cache; if the
information is not in the cache then it has to be read in from disk and placed into the
cache. The more information held in the cache, the less Oracle has to access the slow
disks. The SHARED_POOL_SIZE parameter is used to determine the size of the shared
pool; there is no way to adjust the caches independently, you can only adjust the shared
pool size. The shared pool uses an LRU (least recently used) list to maintain what is held in
the buffer; see the buffer cache for more details on the LRU.
4.Large Pool: The large pool is not so named because it is a large structure (although it
may very well be large in size). It is so named because it is used for allocations of large
pieces of memory that are bigger than the shared pool is designed to handle. Large
memory allocations tend to get a chunk of memory, use it, and then be done with it. There
was no need to cache this memory as in buffer cache and Shared Pool, hence a new pool
was allocated. So basically Shared pool is more like Keep Pool whereas Large Pool is
similar to the Recycle Pool. Large pool is used specifically by:
Shared server connections, to allocate the UGA region in the SGA.
Parallel execution of statements, to allow for the allocation of interprocess
message buffers, which are used to coordinate the parallel query servers.
Backup for RMAN disk I/O buffers in some cases.
5.Java Pool: The Java pool is used in different ways, depending on the mode in which the
Oracle server is running. In dedicated server mode the total memory required for the
Java pool is quite modest and can be determined based on the number of Java classes
you'll be using. With shared server connections, the Java pool includes the shared part of
each Java class and some of the UGA used for the per-session state of each session, which is
allocated from the JAVA_POOL within the SGA.
6.Streams Pool: The Streams pool (or up to 10 percent of the shared pool if no Streams
pool is configured) is used to buffer queue messages used by the Streams process as it
moves or copies data from one database to another.
The SGA comprises a number of memory components, which are pools of memory used
to satisfy a particular class of memory allocation requests. Examples of memory
components include the shared pool (used to allocate memory for SQL and PL/SQL
execution), the java pool (used for java objects and other java execution memory), and the
buffer cache (used for caching disk blocks). All SGA components allocate and deallocate
space in units of granules. Oracle Database tracks SGA memory use in internal numbers
of granules for each SGA component.Granule size is determined by total SGA size. On
most platforms, the size of a granule is 4 MB if the total SGA size is less than 1 GB, and
granule size is 16 MB for larger SGAs. Some platform dependencies arise; for example,
on 32-bit Windows, the granule size is 8 MB for SGAs larger than 1 GB. Oracle Database
can set limits on how much virtual memory the database uses for the SGA. It can start
instances with minimal memory and allow the instance to use more memory by expanding
the memory allocated for SGA components, up to a maximum determined by
the SGA_MAX_SIZE initialization parameter. If the value of SGA_MAX_SIZE in the initialization
parameter file or server parameter file (SPFILE) is less than the sum of the memory allocated
for all components, either explicitly in the parameter file or by default, at the time the
instance is initialized, then the database ignores the setting for SGA_MAX_SIZE.
2)PGA:
PGA is the memory reserved for each user process connecting to an Oracle Database and
is allocated when a process is created and deallocated when a process is terminated.
Contents of PGA:-
Private SQL Area: Contains data such as bind information and run-time memory
structures. It contains Persistent Area which contains bind information and is freed
only when the cursor is closed and Run time Area which is created as the first step
of an execute request. This area is freed only when the statement has been
executed. The number of Private SQL areas that can be allocated to a user process
depends on the OPEN_CURSORS initialization parameter.
Session Memory: Consists of memory allocated to hold a session's variables and
other info related to the session.
SQL Work Areas: Used for memory intensive operations such as: Sort, Hash-join,
Bitmap merge, Bitmap Create.
NOTE:- From 11gR1 You can set MEMORY_TARGET and auto-mem management for
both SGA and PGA is taken care.
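A minimal sketch of enabling this automatic memory management (the 2G value is illustrative; the setting is written to the SPFILE, takes effect at the next instance startup, and must fit within MEMORY_MAX_TARGET):
ALTER SYSTEM SET memory_target = 2G SCOPE = SPFILE;
With this set, Oracle automatically redistributes memory between the SGA and the PGA as the workload changes.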
Many DBAs ask how PGA memory is allocated, and there are several misconceptions about
it, so here is a short note on the same.
Oracle will try and keep the PGA under the target value, but if you exceed this value
Oracle will perform multi-pass operations (disk operations).
3)UGA:
The UGA (User Global Area) holds your session state information; this area of memory will be
accessed by your current session. Depending on the connection type, the UGA can be
located in the SGA (for a shared server connection), where it is accessible by any one of the
shared server processes; because a dedicated connection does not use shared servers, in
that case the memory will be located in the PGA.
Shared server - UGA will be part of the SGA
Dedicated server - UGA will be the PGA
CURSOR: A cursor is a temporary work area created in the system memory when a SQL
statement is executed. A cursor contains information on a select statement and the rows of
data accessed by it. This temporary work area is used to store the data retrieved from the
database, and manipulate this data. A cursor can hold more than one row, but can process
only one row at a time. The set of rows the cursor holds is called the active set.
1)Implicit Cursor
2)Explicit Cursor
They must be created when you are executing a SELECT statement that returns more
than one row. Even though the cursor stores multiple records, only one record can be
processed at a time, which is called the current row. When you fetch a row, the current row
position moves to the next row.
For Example: When you execute INSERT, UPDATE, or DELETE statements the cursor
attributes tell us whether any rows are affected and how many have been affected. When
a SELECT... INTO statement is executed in a PL/SQL Block, implicit cursor attributes can
be used to find out whether any row has been returned by the SELECT statement. PL/SQL
returns an error when no data is selected.
In PL/SQL, you can refer to the most recent implicit cursor as the SQL cursor, which
always has the attributes like %FOUND, %ISOPEN, %NOTFOUND, and %ROWCOUNT.
The SQL cursor has additional attributes, %BULK_ROWCOUNT and
%BULK_EXCEPTIONS, designed for use with the FORALL statement.
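A minimal PL/SQL sketch of these implicit cursor attributes, assuming an illustrative emp table with sal and deptno columns (run with SERVEROUTPUT ON to see the messages):
BEGIN
  UPDATE emp SET sal = sal * 1.10 WHERE deptno = 20;
  IF SQL%FOUND THEN
    DBMS_OUTPUT.PUT_LINE(SQL%ROWCOUNT || ' row(s) updated');
  ELSE
    DBMS_OUTPUT.PUT_LINE('No rows matched the update');
  END IF;
END;
/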
TRIGGER: Triggers are stored programs which are automatically executed or fired when
some events occur. A trigger is automatically associated with a DML statement; when the DML
statement executes, the trigger executes implicitly. You can create a trigger using the CREATE
TRIGGER statement. If the trigger is enabled, it fires implicitly when its DML statement
executes; if the trigger is disabled, it can't fire.
Triggers could be defined on the table, view, schema, or database with which the event is
associated.
Advantages of trigger:
2) By using triggers, business rules and transactions are easy to store in database and
can be used consistently even if there are future updates to the database.
4) When a change happens in the database, a trigger can propagate the change to the
entire database.
Use the CREATE TRIGGER statement to create and enable a database trigger.
Before a trigger can be created, the user SYS must run a SQL script commonly
called DBMSSTDX.SQL. The exact name and location of this script depend on your operating
system.
To create a trigger in your own schema on a table in your own schema or on your
own schema (SCHEMA), you must have the CREATE TRIGGER system privilege.
To create a trigger in any schema on a table in any schema, or on another user's
schema (schema.SCHEMA), you must have the CREATE ANY TRIGGER system privilege.
In addition to the preceding privileges, to create a trigger on DATABASE, you must
have the ADMINISTER DATABASE TRIGGER system privilege.
If the trigger issues SQL statements or calls procedures or functions, then the owner of the
trigger must have the privileges necessary to perform these operations. These privileges
must be granted directly to the owner rather than acquired through roles.
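A minimal sketch of a row-level trigger, assuming illustrative emp (with empno and sal columns) and emp_audit tables:
CREATE OR REPLACE TRIGGER emp_salary_audit
  BEFORE UPDATE OF sal ON emp
  FOR EACH ROW
BEGIN
  INSERT INTO emp_audit (empno, old_sal, new_sal, changed_on)
  VALUES (:OLD.empno, :OLD.sal, :NEW.sal, SYSDATE);
END;
/
The trigger fires once for each row touched by an UPDATE of sal, recording the old and new values.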
Extents
The next level of logical database space is called an extent. An extent is a specific number
of contiguous data blocks that is allocated for storing a specific type of information.
Segments
The level of logical database storage above an extent is called a segment. A segment is a
set of extents that have been allocated for a specific type of data structure, and that all are
stored in the same tablespace. For example, each table's data is stored in its own data
segment, while each index's data is stored in its own index segment. Oracle allocates
space for segments in extents. Therefore, when the existing extents of a segment are full,
Oracle allocates another extent for that segment. Because extents are allocated as
needed, the extents of a segment may or may not be contiguous on disk. The segments
also can span files, but the individual extents cannot.
- data segments
- index segments
- rollback segments
- temporary segments
Data Segments:
Every nonclustered table in an Oracle database has a single data segment to hold all of its
data. This data segment is created when you create an object with the CREATE
TABLE/SNAPSHOT/SNAPSHOT LOG command. Also, a data segment is created for a
cluster when a CREATE CLUSTER command is issued.
The storage parameters control the way that its data segment's extents are allocated.
These affect the efficiency of data retrieval and storage for the data segment associated
with the object.
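A minimal sketch of setting storage parameters when creating a table (the table and tablespace names are illustrative; in a locally managed tablespace Oracle may round or ignore some of these values):
CREATE TABLE sales_history (
  sale_id   NUMBER,
  sale_date DATE,
  amount    NUMBER(10,2)
)
TABLESPACE users
STORAGE (INITIAL 64K NEXT 64K PCTINCREASE 0);
Each time the existing extents of the segment fill up, Oracle allocates another extent according to these parameters.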
Index Segments:
Every index in an Oracle database has a single index segment to hold all of its data.
Oracle creates the index segment for the index when you issue the CREATE INDEX
command. Setting the storage parameters directly affects the efficiency of data retrieval
and storage.
Rollback Segments
Rollbacks are required when transactions that affect the database need to be undone.
Rollbacks are also needed at the time of system failures. Just as the rolled-back (undo) data
is saved in rollback segments, the data needed to redo changes is held in the redo log.
A rollback segment is a portion of the database that records the actions of transactions if
the transaction should be rolled back. Each database contains one or more rollback
segments. Rollback segments are used to provide read consistency, to roll back
transactions, and to recover the database.
Types of rollbacks:
- statement level rollback
- rollback to a savepoint
- rollback of a transaction due to user request
- rollback of a transaction due to abnormal process termination
- rollback of all outstanding transactions when an instance terminates abnormally
- rollback of incomplete transactions during recovery.
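Two of the rollback types above can be seen directly in SQL; a minimal sketch with illustrative orders and order_items tables:
INSERT INTO orders VALUES (1001, SYSDATE, 'NEW');
SAVEPOINT before_items;
INSERT INTO order_items VALUES (1001, 1, 'WIDGET', 5);
ROLLBACK TO SAVEPOINT before_items;  -- undoes only the second insert
COMMIT;                              -- the first insert becomes permanent
The undo information needed by ROLLBACK TO SAVEPOINT is read from a rollback (undo) segment.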
Temporary Segments:
SELECT statements may need temporary storage. When queries are fired, Oracle needs an
area to do sorting and other operations, which is why temporary segments are useful.
The commands that may use temporary storage when used with SELECT are:
GROUP BY, UNION, DISTINCT, etc.
Oracle Trigger
Oracle allows you to define procedures that are implicitly executed when an
INSERT, UPDATE, or DELETE statement is issued against the associated table. These
procedures are called database triggers.
Oracle Cursor
When Oracle processes an SQL statement, it creates a memory area known as the context
area; a cursor is a pointer to this context area. PL/SQL controls the context area through a cursor.
A cursor holds the rows (one or more) returned by a SQL statement. The set of rows the
cursor holds is referred to as the active set.
You can name a cursor so that it could be referred to in a program to fetch and process the
rows returned by the SQL statement, one at a time. There are two types of cursors
Implicit cursors
Explicit cursors
Implicit Cursors
Implicit cursors are automatically created by Oracle whenever an SQL statement is executed,
when there is no explicit cursor for the statement. Programmers cannot control the implicit
cursors and the information in it.
Whenever a DML statement (INSERT, UPDATE and DELETE) is issued, an implicit cursor is
associated with this statement. For INSERT operations, the cursor holds the data that needs
to be inserted. For UPDATE and DELETE operations, the cursor identifies the rows that would
be affected.
In PL/SQL, you can refer to the most recent implicit cursor as the SQL cursor, which always
has attributes such as %FOUND, %ISOPEN, %NOTFOUND, and %ROWCOUNT. The SQL cursor
has additional attributes, %BULK_ROWCOUNT and %BULK_EXCEPTIONS, designed for use
with the FORALL statement.
Explicit Cursors
Explicit cursors are programmer-defined cursors for gaining more control over the context
area. An explicit cursor should be defined in the declaration section of the PL/SQL Block. It is
created on a SELECT Statement which returns more than one row.
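A minimal sketch of declaring and using an explicit cursor, assuming an illustrative emp table with empno, ename and deptno columns (run with SERVEROUTPUT ON):
DECLARE
  CURSOR c_emp IS
    SELECT empno, ename FROM emp WHERE deptno = 10;
  v_empno emp.empno%TYPE;
  v_ename emp.ename%TYPE;
BEGIN
  OPEN c_emp;
  LOOP
    FETCH c_emp INTO v_empno, v_ename;
    EXIT WHEN c_emp%NOTFOUND;
    DBMS_OUTPUT.PUT_LINE(v_empno || ' ' || v_ename);
  END LOOP;
  CLOSE c_emp;
END;
/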
Exception Handling
PL/SQL facilitates programmers in catching such error conditions using an exception block in
the program, and an appropriate action is taken against the error condition.
o System-defined Exceptions
o User-defined Exceptions
DECLARE
   <declarations section>
BEGIN
   <executable command(s)>
EXCEPTION
   <exception handling goes here>
   WHEN exception1 THEN
      exception1-handling-statements
   WHEN exception2 THEN
      exception2-handling-statements
   WHEN exception3 THEN
      exception3-handling-statements
   ........
   WHEN others THEN
      others-handling-statements
END;
PL/SQL catches and handles exceptions by using exception handler architecture.
Whenever an exception occurs, it is raised. The current PL/SQL block execution halts and
control is passed to a separate section called the exception section. In the exception section,
you can check what kind of exception has occurred and handle it appropriately. This
exception handler architecture enables separating the business logic from the exception
handling code, hence making the program easier to read and maintain.
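A minimal sketch of this structure in action, assuming an illustrative emp table with empno and sal columns:
DECLARE
  v_sal emp.sal%TYPE;
BEGIN
  SELECT sal INTO v_sal FROM emp WHERE empno = 9999;
  DBMS_OUTPUT.PUT_LINE('Salary: ' || v_sal);
EXCEPTION
  WHEN NO_DATA_FOUND THEN
    DBMS_OUTPUT.PUT_LINE('No employee with that number');
  WHEN TOO_MANY_ROWS THEN
    DBMS_OUTPUT.PUT_LINE('More than one employee returned');
  WHEN OTHERS THEN
    DBMS_OUTPUT.PUT_LINE('Unexpected error: ' || SQLERRM);
END;
/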
Delivery
Accuracy
Timeliness
1) Delivery means the system must deliver data to the correct destination. Data must be
received by the intended device.
2) Accuracy means data must be delivered accurately; the data should not be altered
during transmission.
3) Timeliness means data should be delivered on time. When data in the form of video or
audio is transferred to another location as it is produced, this is called real-time transmission.
Serial communication
Parallel communication
Serial communication
In telecommunication and computer science, serial communication is the process of
sending data one bit at a time, sequentially, over a single wire on a communication
channel or computer bus. Serial is a common communication protocol that is used by
many devices, and serial communication has become the standard for intercomputer
communication. Serial communication is used for all long-haul communication and most
computer networks, as it saves the cost of cabling. Serial communication is a popular means of
transmitting data between a computer and a peripheral device such as a programmable
instrument or even another computer. It is also easy to establish and no extra devices are
needed because most computers have one or more serial ports. Examples are RS-232,
Universal Serial Bus, RS-423 and PCI Express.
Parallel communication
Parallel communication is a fast method of communication. In parallel transmission, data is
transmitted across parallel wires: flat cables made up of multiple smaller
cables, each of which can carry a single bit of information. A parallel cable can carry a group
of data bits at the same time. In telecommunication and computer science, parallel
communication is a method of sending several data signals over a communication link at
one time. Examples are Industry Standard Architecture (ISA), Parallel ATA, IEEE
1284 and Conventional PCI.
For synchronous data transfer, both sender and receiver access the data
according to the same clock. Therefore, a special line for the clock signal is
required. A master (or one of the senders) should provide the clock signal to all the
receivers in synchronous data transfer mode. Synchronous data transfer supports
very high data transfer rate.
For asynchronous data transfer, there is no common clock signal between the
senders and receivers. Therefore, the sender and the receiver first need to agree
on a data transfer speed. This speed usually does not change after data transfer
starts. The data transfer rate is slower in asynchronous data transfer.
Data Flow: Communication between two devices can be simplex, half-duplex, or full-
duplex:
In simplex mode, the communication is unidirectional, as on a one-way street. Only one
of the two devices on a link can transmit; the other can only receive. Keyboards and
traditional monitors are examples of simplex devices. The keyboard can only introduce
input; the monitor can only accept output. The simplex mode can use the entire capacity of
the channel to send data in one direction.
In half-duplex mode, each station can both transmit and receive, but not at the same
time. When one device is sending, the other can only receive, and vice versa. The half-
duplex mode is like a one-lane road with traffic allowed in both directions. When cars are
traveling in one direction, cars going the other way must wait. Walkie-talkies and CB
(citizens band) radios are both half-duplex systems. The half-duplex mode is used in
cases where there is no need for communication in both directions at the same time; the
entire capacity of the channel can be utilized for each direction.
Type of Connection: A network is two or more devices connected through links. A link is
a communications pathway that transfers data from one device to another. For
visualization purposes, it is simplest to imagine any link as a line drawn between two
points. For communication to occur, two devices must be connected in some way to the
same link at the same time.
b)Multipoint: A multipoint (also called multi drop) connection is one in which more than
two specific devices share a single link. In a multipoint environment, the capacity of the
channel is shared, either spatially or temporally. If several devices can use the link
simultaneously, it is a spatially shared connection. If users must take turns, it is a
timeshared connection.
Devices on the network are referred to as 'nodes.' The most common nodes are
computers and peripheral devices. Network topology is illustrated by showing these nodes
and their connections using cables.
1)Bus Topology
2)Ring Topology
3)Star Topology
4)Mesh Topology
5)Tree Topology
1)Bus Topology: In networking a bus is the central cable -- the main wire -- that connects
all devices on a local-area network (LAN). It is also called the backbone. This is often used
to describe the main network connections composing the Internet. Bus networks are
relatively inexpensive and easy to install for small networks. Ethernet systems use a bus
topology. A signal from the source is broadcast and travels to all workstations
connected to the bus cable. Although the message is broadcast, only the intended
recipient, whose MAC address or IP address matches, accepts it. If the MAC/IP address
of a machine doesn't match the intended address, the machine discards the signal. A
terminator is added at each end of the central cable to prevent bouncing of signals. A barrel
connector can be used to extend it.
Advantages of Bus Topology
1. It is cost effective.
2. Cable required is least compared to other network topology.
3. Used in small networks.
4. It is easy to understand.
5. Easy to expand by joining two cables together.
2)Ring Topology: All the nodes are connected to each-other in such a way that they
make a closed loop. Each workstation is connected to two other components on either
side, and it communicates with these two adjacent neighbors. Data travels around the
network, in one direction. Sending and receiving of data takes place by the help of
TOKEN.
Token Passing: A token contains a piece of information which, along with the data, is sent by
the source computer. This token then passes to the next node, which checks if the signal is
intended for it. If yes, it receives the data and passes an empty token back into the network;
otherwise it passes the token along with the data to the next node. This process continues
until the signal reaches its intended destination.
Only the node holding the token is allowed to send data. Other nodes have to wait for
an empty token to reach them. This network is usually found in offices, schools and small
buildings.
3)Star Topology: In a star network devices are connected to a central computer, called a
hub. Nodes communicate across the network by passing data through the hub.
Advantages of Star Topology
1) As compared to Bus topology it gives far better performance; signals don't
necessarily get transmitted to all the workstations. A sent signal reaches the intended
destination after passing through no more than 3-4 devices and 2-3 links. Performance of
the network is dependent on the capacity of the central hub.
2) Easy to connect new nodes or devices. In star topology new nodes can be added
easily without affecting rest of the network. Similarly components can also be removed
easily.
3) Centralized management. It helps in monitoring the network.
4) Failure of one node or link doesn't affect the rest of the network. At the same time it's easy
to detect the failure and troubleshoot it.
Disadvantages of Star Topology
1) Too much dependency on the central device has its own drawbacks. If it fails, the whole
network goes down.
2) The use of a hub, a router or a switch as the central device increases the overall cost of the
network.
3) Performance, as well as the number of nodes which can be added in such a topology, is
dependent on the capacity of the central device.
4)Mesh Topology:In a mesh network, devices are connected with many redundant
interconnections between network nodes. In a true mesh topology every node has a
connection to every other node in the network.
Full mesh topology: occurs when every node has a circuit connecting it to every other
node in a network. Full mesh is very expensive to implement but yields the greatest
amount of redundancy, so in the event that one of those nodes fails, network traffic can be
directed to any of the other nodes. Full mesh is usually reserved for backbone networks.
Partial mesh topology: is less expensive to implement and yields less redundancy than
full mesh topology. With partial mesh, some nodes are organized in a full mesh scheme
but others are only connected to one or two in the network. Partial mesh topology is
commonly found in peripheral networks connected to a full meshed backbone.
ADVANTAGES OF MESH TOPOLOGY
5)Tree Topology: Tree Topology integrates the characteristics of Star and Bus Topology.
Earlier we saw how, in physical star network topology, computers (nodes) are connected
to each other through a central hub. And we also saw that in bus topology, workstation devices
are connected by a common cable called the bus. After understanding these two network
configurations, we can understand tree topology better. In tree topology, a number of
star networks are connected using a bus. This main cable seems like the main stem of a tree,
and the other star networks are the branches. It is also called Expanded Star Topology.
DISADVANTAGES OF TREE TOPOLOGY
1. Heavily cabled.
2. Costly.
3. If more nodes are added maintenance is difficult.
4. Central hub fails, network fails.
6)Hybrid Topology: A hybrid topology is a type of network topology that uses two or more
other network topologies, including bus topology, mesh topology, ring topology, star
topology, and tree topology.
Hybrid network topology has many advantages. Hybrid topologies are flexible, reliable and
have increased fault tolerance. New nodes can be easily added to the hybrid network,
and network faults can be easily diagnosed and corrected without affecting the work of the
rest of the network. But at the same time hybrid topologies are expensive and difficult to
manage.
Types of Network:
1)LAN: A LAN connects network devices over a relatively short distance. A networked
office building, school, or home usually contains a single LAN, though sometimes one
building will contain a few small LANs (perhaps one per room), and occasionally a LAN will
span a group of nearby buildings. In TCP/IP networking, a LAN is often but not always
implemented as a single IP subnet. A LAN typically relies mostly on wired connections for
increased speed and security, but wireless connections can also be part of a LAN. High
speed and relatively low cost are the defining characteristics of LANs. A LAN typically has a
maximum span of about 10 km.
2)WAN: A wide area network, or WAN, occupies a very large area, such as an entire
country or the entire world. A WAN can contain multiple smaller networks, such as LANs
or MANs. The Internet is the best-known example of a public WAN.
3)MAN: A metropolitan area network (MAN) is a hybrid between a LAN and a WAN. Like a
WAN, it connects two or more LANs in the same geographic area. A MAN, for example,
might connect two different buildings or offices in the same city. However, whereas WANs
typically provide low- to medium-speed access, MANs provide high-speed connections,
such as T1 (1.544 Mbps) and optical services.
The optical services provided include SONET (the Synchronous Optical Network standard)
and SDH (the Synchronous Digital Hierarchy standard). With these optical services,
carriers can provide high-speed services, including ATM and Gigabit Ethernet. These two
optical services provide speeds ranging into the hundreds or thousands of megabits per
second (Mbps). Devices used to provide connections for MANs include high-end routers,
ATM switches, and optical switches.
5)Campus Area Network: This is a network which is larger than a LAN, but smaller than
a MAN. This is typical in areas such as a university, large school or small business. It is
typically spread over a collection of buildings which are reasonably local to each other. It
may have an internal Ethernet as well as capability of connecting to the internet.
6)Storage Area Network: This network connects servers directly to devices which store
large amounts of data without relying on a LAN or WAN network to do so. This can involve
another type of connection known as Fibre Channel, a system similar to Ethernet which
handles high-performance disk storage for applications on a number of professional
networks.
OSI layers
The main concept of OSI is that the process of communication between two endpoints in a
telecommunication network can be divided into seven distinct groups of related functions,
or layers. Each communicating user or program is at a computer that can provide those
seven layers of function. So in a given message between users, there will be a flow of data
down through the layers in the source computer, across the network and then up through
the layers in the receiving computer. The seven layers of function are provided by a
combination of applications, operating systems, network card device drivers and
networking hardware that enable a system to put a signal on a network cable or out
over Wi-Fi or another wireless protocol.
1. Data link layer synchronizes the information which is to be transmitted over the
physical layer.
2. The main function of this layer is to make sure data transfer is error free from one
node to another, over the physical layer.
3. Transmitting and receiving data frames sequentially is managed by this layer.
4. This layer sends and expects acknowledgements for frames received and sent
respectively. Resending of non-acknowledgement received frames is also handled
by this layer.
5. This layer establishes a logical link between two nodes and also manages the
frame traffic control over the network. It signals the transmitting node to stop when
the frame buffers are full.
1. It routes the signal through different channels from one node to another.
2. It acts as a network controller. It manages the Subnet traffic.
3. It decides by which route data should take.
4. It divides the outgoing messages into packets and assembles the incoming packets
into messages for higher levels.
Transport layer breaks the message (data) into small units so that they are handled more
efficiently by the network layer.
LAYER 5: THE SESSION LAYER :
1. The session layer manages and synchronizes the conversation between two different
applications.
2. During transfer of data from source to destination, the session layer marks and
re-synchronizes the streams of data properly, so that the ends of the messages are not
cut prematurely and data loss is avoided.
1. Presentation layer takes care that the data is sent in such a way that the receiver
will understand the information (data) and will be able to use the data.
2. While receiving the data, presentation layer transforms the data to be ready for the
application layer.
3. Languages(syntax) can be different of the two communicating systems. Under this
condition presentation layer plays a role of translator.
4. It performs Data compression, Data encryption, Data conversion etc.
1. OSI model distinguishes well between the services, interfaces and protocols.
2. Protocols of OSI model are very well hidden.
3. Protocols can be replaced by new protocols as technology changes.
4. Supports connection oriented services as well as connectionless service.
Network Interface Card (NIC): NIC provides a physical connection between the
networking cable and the computer's internal bus. NICs come in three basic varieties: 8-bit,
16-bit and 32-bit. The larger the number of bits that can be transferred to the NIC, the faster
the NIC can transfer data to the network cable.
Repeater: Repeaters are used to connect together two Ethernet segments of any media
type. In larger designs, signal quality begins to deteriorate as segments exceed their
maximum length. We also know that signal transmission is always accompanied by energy
loss. So, a periodic refreshing of the signals is required.
Hubs: Hubs are actually multiport repeaters. A hub takes any incoming signal and
repeats it out of all ports.
Bridges: When the size of the LAN is difficult to manage, it is necessary to break up the
network. The function of the bridge is to connect separate networks together. Bridges do
not forward bad or misaligned packets.
Switch: Switches are an expansion of the concept of bridging. Cut-through switches
examine only the packet destination address before forwarding it onto its destination
segment, while a store-and-forward switch accepts and analyzes the entire packet before
forwarding it to its destination. It takes more time to examine the entire packet, but it allows
catching certain packet errors and keeping them from propagating through the network.
Routers: Router forwards packets from one LAN (or WAN) network to another. It is also
used at the edges of the networks to connect to the Internet.
Gateway: Gateway acts like an entrance between two different networks. Gateway in
organisations is the computer that routes the traffic from a work station to the outside
network that is serving web pages. ISP (Internet Service Provider) is the gateway for
Internet service at homes.
ARP: Address Resolution Protocol (ARP) is a protocol for mapping an Internet Protocol
address (IP address) to a physical machine address that is recognized in the local
network. For example, in IP Version 4, the most common level of IP in use today, an
address is 32 bits long. In an Ethernet local area network, however, addresses for
attached devices are 48 bits long. (The physical machine address is also known as a
Media Access Control or MAC address.) A table, usually called the ARP cache, is used to
maintain a correlation between each MAC address and its corresponding IP address. ARP
provides the protocol rules for making this correlation and providing address conversion in
both directions.
There are four types of ARP messages that may be sent by the ARP protocol. These are
identified by four values in the "operation" field of an ARP message. The types of message
are:
1)ARP request
2)ARP reply
3)RARP request
4)RARP reply
Frame Relay:
Frame Relay is a standardized wide area network technology that operates at the physical
and logical link layers of the OSI model. Frame Relay was originally designed for transport
across Integrated Services Digital Network (ISDN) infrastructure, but it may be used today in
the context of many other network interfaces.
Frame relay is an example of a packet switched technology. Packet switched network
enables end stations to dynamically share the network medium and the available
bandwidth.
Frame Relay is often described as a streamlined version of X.25; this is because Frame Relay
typically operates over WAN facilities that offer more reliable connection services. Frame
Relay is strictly a layer 2 protocol suite, whereas X.25 provides services at layer 3.
For most services, the network provides a permanent virtual circuit (PVC), which means
that the customer sees a continuous, dedicated connection without having to pay for a full-
time leased line, while the service provider figures out the route each frame travels to its
destination and can charge based on usage. Switched virtual circuits (SVC), by contrast,
are temporary connections that are destroyed after a specific data transfer is completed.In
order for a frame relay WAN to transmit data, data terminal equipment (DTE) and data
circuit-terminating equipment (DCE) are required. DTEs are typically located on the
customer's premises and can encompass terminals, routers, bridges and personal
computers. DCEs are managed by the carriers and provide switching and associated
services.
Frame Relay virtual circuits fall into two categories: switched virtual circuits (SVCs) and
permanent virtual circuits (PVCs).
Switched virtual circuits (SVCs) are temporary connections used in situations requiring
only sporadic data transfer between DTE devices across the Frame Relay network. A
communication session across an SVC consists of the following four operational states:
Call setup: The virtual circuit between two Frame Relay DTE devices is established.
Data transfer: Data is transmitted between the DTE devices over the virtual circuit.
Idle: The connection between DTE devices is still active, but no data is transferred. If an
SVC remains in an idle state for a defined period of time, the call can be terminated.
Call termination: The virtual circuit between the DTE devices is terminated.
Permanent virtual circuits (PVCs) are permanently established connections that are used
for frequent and consistent data transfers between DTE devices across the Frame Relay
network. Communication across a PVC does not require the call setup and termination
states that are used with SVCs. PVCs always operate in one of the following two
operational states:
Data transfer: Data is transmitted between the DTE devices over the virtual circuit.
Idle: The connection between DTE devices is active, but no data is transferred. Unlike
SVCs, PVCs will not be terminated under any circumstances when in an idle state.
DTE devices can begin transferring data whenever they are ready because the circuit is
permanently established.
X.25:
X.25 Packet Switched networks allow remote devices to communicate with each other
over private digital links without the expense of individual leased lines. Packet Switching is
a technique whereby the network routes individual packets of HDLC data between
different destinations based on addressing within each packet. An X.25 network consists
of a network of interconnected nodes to which user equipment can connect. The user end
of the network is known as Data Terminal Equipment (DTE) and the carrier's equipment
is Data Circuit-terminating Equipment (DCE). X.25 routes packets across the network
from DTE to DTE.
The X.25 standard corresponds in functionality to the first three layers of the Open
Systems Interconnection (OSI) reference model for networking. Specifically, X.25 defines
the following:
The physical layer interface for connecting data terminal equipment (DTE), such as
computers and terminals at the customer premises, with the data communications
equipment (DCE), such as X.25 packet switches at the X.25 carrier's facilities. The
physical layer interface of X.25 is called X.21bis and was derived from the RS-232
interface for serial transmission.
The data-link layer protocol called Link Access Procedure, Balanced (LAPB), which
defines encapsulation (framing) and error-correction methods. LAPB also enables
the DTE or the DCE to initiate or terminate a communication session or initiate data
transfer. LAPB is derived from the High-level Data Link Control (HDLC) protocol.
The network layer protocol called the Packet Layer Protocol (PLP), which defines
how to address and deliver X.25 packets between end nodes and switches on an
X.25 network using permanent virtual circuits (PVCs) or switched virtual circuits
(SVCs). This layer is responsible for call setup and termination and for managing
transfer of packets.
IP address is short for Internet Protocol (IP) address. An IP address is an identifier for
a computer or device on a TCP/IP network. Networks using the
TCP/IP protocol route messages based on the IP address of the destination.
The IP protocol itself specifies the format of packets, also called datagrams, and
the addressing scheme.
An IP is a 32-bit number comprised of a host number and a network prefix, both of which
are used to uniquely identify each node within a network.To make these addresses more
readable, they are broken up into 4 bytes, or octets, where any 2 bytes are separated by a
period. This is commonly referred to as dotted decimal notation.The first part of an Internet
address identifies the network on which the host resides, while the second part identifies
the particular host on the given network. This creates the two-level addressing
hierarchy.All hosts on a given network share the same network prefix but must have a
unique host number. Similarly, any two hosts on different networks must have different
network prefixes but may have the same host number. Subnet masks are 32 bits long and
are typically represented in dotted-decimal (such as 255.255.255.0) or the number of
networking bits (such as /24).
The hosts formula tells you how many hosts are allowed on a network that has a
certain subnet mask. The hosts formula is 2^n - 2, where n is the number of 0s in the
subnet mask when the subnet mask is converted to binary.
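A quick way to check the hosts formula in Python (a minimal sketch; the helper name usable_hosts is illustrative, not from the text):

# Hedged sketch: usable hosts for a given prefix length.
def usable_hosts(prefix_length):
    host_bits = 32 - prefix_length        # number of 0s in the subnet mask
    return 2 ** host_bits - 2             # subtract the network and broadcast addresses

print(usable_hosts(24))   # 254 usable hosts in a /24 (255.255.255.0)
print(usable_hosts(26))   # 62 usable hosts in a /26 (255.255.255.192)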
Network Masks
A network mask helps you know which portion of the address identifies the network and
which portion of the address identifies the node. Class A, B, and C networks have default
masks, also known as natural masks, as shown here:
Class A: 255.0.0.0
Class B: 255.255.0.0
Class C: 255.255.255.0
An IP address on a Class A network that has not been subnetted would have an
address/mask pair similar to: 8.20.15.1 255.0.0.0. In order to see how the mask helps you
identify the network and node parts of the address, convert the address and mask to
binary numbers.
8.20.15.1 = 00001000.00010100.00001111.00000001
255.0.0.0 = 11111111.00000000.00000000.00000000
Once you have the address and the mask represented in binary, then identification of the
network and host ID is easier. Any address bits which have corresponding mask bits set to
1 represent the network ID. Any address bits that have corresponding mask bits set to 0
represent the node ID.
8.20.15.1 = 00001000.00010100.00001111.00000001
255.0.0.0 = 11111111.00000000.00000000.00000000
-----------------------------------
net id | host id
netid = 00001000 = 8
hostid = 00010100.00001111.00000001 = 20.15.1
A subnet mask is what tells the computer what part of the IP address is the
network and what part is for the host computers on that network.
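The same split can be reproduced with a bitwise AND between the address and the mask; a minimal sketch (the helper name to_int is illustrative):

# Hedged sketch: separate the network ID and host ID of 8.20.15.1 / 255.0.0.0.
def to_int(dotted):
    a, b, c, d = (int(x) for x in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

addr = to_int("8.20.15.1")
mask = to_int("255.0.0.0")

network_id = addr & mask                     # address bits where the mask bits are 1
host_id = addr & ~mask & 0xFFFFFFFF          # address bits where the mask bits are 0

print(network_id >> 24)                                               # 8
print((host_id >> 16) & 0xFF, (host_id >> 8) & 0xFF, host_id & 0xFF)  # 20 15 1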
Subnetting
Subnetting is the process of breaking a large network into smaller networks known as subnets.
Subnetting happens when we extend the default boundary of the subnet mask; basically we
borrow host bits to create networks. Let's take an example.
Being a network administrator you are asked to create two networks, each of which will host 30
systems. A single class C IP range can fulfil this requirement, but without subnetting you would
have to purchase two class C ranges, one for each network. A single class C range provides 256
total addresses and each network needs only about 30, so roughly 226 addresses per range
would be wasted, and these unused addresses would generate additional route advertisements,
slowing down the network. With subnetting you only need to purchase a single class C range.
You can configure the router to take the first 26 bits instead of the default 24 bits as network
bits. In this case we extend the default boundary of the subnet mask and borrow 2 host bits to
create networks. By taking two bits from the host range and counting them as network bits, we
can create new subnets and assign hosts to them. As long as the network bits of two addresses
match, they belong to the same subnet; change either of the borrowed bits and you are in a
new subnet.
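A sketch of this borrowing using Python's standard ipaddress module (the 192.168.1.0/24 range is an assumed example value, not taken from the text):

# Hedged sketch: borrow 2 host bits from a /24 to create /26 subnets.
import ipaddress

block = ipaddress.ip_network("192.168.1.0/24")        # assumed class C range
for subnet in block.subnets(new_prefix=26):
    print(subnet, "usable hosts:", subnet.num_addresses - 2)   # 62 hosts each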
Advantage of Subnetting
Subnetting breaks a large network into smaller networks, and smaller networks are
easier to manage.
Subnetting reduces network traffic by removing collision and broadcast traffic, which
improves overall performance.
Subnetting allows you to apply network security policies at the interconnection
between subnets.
Subnetting allows you to save money by reducing the requirement for IP ranges.
CIDR [Classless Inter Domain Routing]: CIDR is a slash notation for the subnet mask.
The slash value tells us the number of on bits in the network mask.
Class A has the default subnet mask 255.0.0.0, which means the first octet of the subnet
mask has all bits on. In slash notation it is written as /8, meaning the network part is
8 bits.
Class B has the default subnet mask 255.255.0.0, which means the first two octets of the
subnet mask have all bits on. In slash notation it is written as /16, meaning the network
part is 16 bits.
Class C has the default subnet mask 255.255.255.0, which means the first three octets of the
subnet mask have all bits on. In slash notation it is written as /24, meaning the network
part is 24 bits.
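Converting between slash notation and a dotted-decimal mask is a simple bit operation; a minimal sketch (the function name is illustrative):

# Hedged sketch: prefix length -> dotted-decimal subnet mask.
def prefix_to_mask(prefix):
    value = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF   # set the top 'prefix' bits
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

print(prefix_to_mask(8))    # 255.0.0.0     (Class A default)
print(prefix_to_mask(16))   # 255.255.0.0   (Class B default)
print(prefix_to_mask(24))   # 255.255.255.0 (Class C default)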
Multiplexing: To combine multiple signals (analog or digital) for transmission over
a single line or media. A common type of multiplexing combines several low-speed
signals for transmission over a single high-speed connection. Multiplexing is done by
using a device called multiplexer (MUX) that combines n input lines to generate one
output line i.e. (many to one). Therefore multiplexer (MUX) has several inputs and one
output. At the receiving end, a device called demultiplexer (DEMUX) is used that
separates signal into its component signals. So DEMUX has one input and several
outputs.
Time Division Multiplexing (TDM): A type of multiplexing that combines data streams by
assigning each stream a different time slot in a set. TDM repeatedly transmits a fixed
sequence of time slots over a single
transmission channel. Within T-Carrier systems, such as T-1 and T-3, TDM
combines Pulse Code Modulated (PCM) streams created for each conversation or data
stream.
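The time-slot idea can be pictured as round-robin interleaving of the input streams; a toy sketch (the stream contents are made up):

# Hedged sketch: round-robin TDM - one unit from each input stream per frame.
streams = [list("AAAA"), list("BBBB"), list("CCCC")]    # three low-speed inputs (made up)

frames = []
for slot in range(len(streams[0])):
    frames.append([s[slot] for s in streams])           # one frame = one slot per stream

print(frames)   # [['A', 'B', 'C'], ['A', 'B', 'C'], ...] sent over the single high-speed link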
Rules of Network Protocol include guidelines that regulate the following characteristics of a
network: access method, allowed physical topologies, types of cabling, and speed of data
transfer.
Types of Network Protocols
Ethernet
Local Talk
Token Ring
FDDI
ATM
Ethernet
The Ethernet protocol is by far the most widely used one. Ethernet uses an access method
called CSMA/CD (Carrier Sense Multiple Access/Collision Detection). This is a system where
each computer listens to the cable before sending anything through the network. If the
network is clear, the computer will transmit. If some other nodes have already transmitted on
the cable, the computer will wait and try again when the line is clear. Sometimes, two
computers attempt to transmit at the same instant. A collision occurs when this happens. Each
computer then backs off and waits a random amount of time before attempting to retransmit.
With this access method, it is normal to have collisions. However, the delay caused by
collisions and retransmitting is very small and does not normally affect the speed of
transmission on the network.
The Ethernet protocol allows for linear bus, star, or tree topologies. Data can be transmitted
over wireless access points, twisted pair, coaxial, or fiber optic cable at a speed of 10 Mbps up
to 1000 Mbps.
Fast Ethernet
To allow for an increased speed of transmission, a newer Ethernet standard was developed
that supports 100 Mbps. This is commonly called Fast Ethernet. Fast Ethernet
requires the application of different, more expensive network concentrators/hubs and
network interface cards. In addition, category 5 twisted pair or fiber optic cable is necessary.
Fast Ethernet is becoming common in schools that have been recently wired.
Local Talk
Local Talk is a network protocol that was developed by Apple Computer, Inc. for Macintosh
computers. The method used by Local Talk is called CSMA/CA (Carrier Sense Multiple Access
with Collision Avoidance). It is similar to CSMA/CD except that a computer signals its intent to
transmit before it actually does so. Local Talk adapters and special twisted pair cable can be
used to connect a series of computers through the serial port. The Macintosh operating
system allows the establishment of a peer-to-peer network without the need for additional
software. With the addition of the server version of AppleShare software, a client/server
network can be established.
The Local Talk protocol allows for linear bus, star, or tree topologies using twisted pair cable. A
primary disadvantage of Local Talk is low speed. Its speed of transmission is only 230 Kbps.
Token Ring
The Token Ring protocol was developed by IBM in the mid-1980s. The access method used
involves token-passing. In Token Ring, the computers are connected so that the signal travels
around the network from one computer to another in a logical ring. A single electronic token
moves around the ring from one computer to the next. If a computer does not have
information to transmit, it simply passes the token on to the next workstation. If a computer
wishes to transmit and receives an empty token, it attaches data to the token. The token then
proceeds around the ring until it comes to the computer for which the data is meant. At this
point, the data is captured by the receiving computer. The Token Ring protocol requires a star-
wired ring using twisted pair or fiber optic cable. It can operate at transmission speeds of 4
Mbps or 16 Mbps. Due to the increasing popularity of Ethernet, the use of Token Ring in
school environments has decreased.
FDDI
Fiber Distributed Data Interface (FDDI) is a network protocol that is used primarily to
interconnect two or more local area networks, often over large distances. The access method
used by FDDI involves token-passing. FDDI uses a dual ring physical topology. Transmission
normally occurs on one of the rings; however, if a break occurs, the system keeps information
moving by automatically using portions of the second ring to create a new complete ring. A
major advantage of FDDI is high speed. It operates over fiber optic cable at 100 Mbps.
ATM
Asynchronous Transfer Mode (ATM) is a network protocol that transmits data at a speed of
155 Mbps and higher. ATM works by transmitting all data in small packets of a fixed size;
whereas, other protocols transfer variable length packets. ATM supports a variety of media
such as video, CD-quality audio, and imaging. ATM employs a star topology, which can work
with fiber optic as well as twisted pair cable.
ATM is most often used to interconnect two or more local area networks. It is also frequently
used by Internet Service Providers to provide high-speed access to the Internet for their clients.
As ATM technology becomes more cost-effective, it will provide another solution for
constructing faster local area networks.
Gigabit Ethernet
A later development in the Ethernet standard is a protocol that has a transmission
speed of 1 Gbps. Gigabit Ethernet is primarily used for backbones on a network at this time. In
the future, it will probably also be used for workstation and server connections. It can be used
with both fiber optic cabling and copper. 1000BaseT, the copper-cable standard for Gigabit
Ethernet, became a formal standard in 1999.
Compare the Network Protocols
Protocol        Cable                          Speed              Topology
Ethernet        Twisted Pair, Coaxial, Fiber   10 Mbps            Linear Bus, Star, Tree
Fast Ethernet   Twisted Pair, Fiber            100 Mbps           Star
Local Talk      Twisted Pair                   0.23 Mbps          Linear Bus or Star
Token Ring      Twisted Pair                   4 Mbps - 16 Mbps   Star-Wired Ring
FDDI            Fiber                          100 Mbps           Dual Ring
ATM             Twisted Pair, Fiber            155-2488 Mbps      Linear Bus, Star, Tree
Carrier Sensed Multiple Access (CSMA) : CSMA is a network access method used on shared
network topologies such as Ethernet to control access to the network. Devices attached to
the network cable listen (carrier sense) before transmitting. If the channel is in use, devices
wait before transmitting. MA (Multiple Access) indicates that many devices can connect to
and share the same network. All devices have equal access to use the network when it is
clear.
In CSMA/CD (Carrier Sense Multiple Access/Collision Detection) Access Method, every host
has equal access to the wire and can place data on the wire when the wire is free from traffic.
When a host wants to place data on the wire, it will sense the wire to find whether there is a
signal already on the wire. If there is traffic already in the medium, the host will wait, and if
there is no traffic, it will place the data in the medium. But, if two systems place data on the
medium at the same instance, they will collide with each other, destroying the data. If the
data is destroyed during transmission, the data will need to be retransmitted. After collision,
each host will wait for a small interval of time and again the data will be retransmitted, to
avoid collision again.
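The "wait for a small interval" step is commonly implemented as binary exponential backoff; a hedged sketch (the slot-time idea and the cap of 10 doublings follow common Ethernet practice but are assumptions here, not from the text):

# Hedged sketch: binary exponential backoff after repeated collisions.
import random

def backoff_slots(collision_count):
    k = min(collision_count, 10)             # assumed cap on the exponent
    return random.randint(0, 2 ** k - 1)     # wait a random number of slot times

for attempt in range(1, 4):
    print("collision", attempt, "-> wait", backoff_slots(attempt), "slot times")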
In CSMA/CA, before a host sends real data on the wire it will sense the wire to check if the
wire is free. If the wire is free, it will send a piece of dummy data on the wire to see whether
it collides with any other data. If it does not collide, the host will assume that the real data also
will not collide.
Token Passing
In CSMA/CD and CSMA/CA there is always a chance of collisions, and as the number of hosts in the
network increases, the chances of collisions also increase. In token passing, when a
host wants to transmit data, it must hold the token, which is an empty packet. The token
circles the network at a very high speed. If any workstation wants to send data, it must wait
for the token. When the token reaches the workstation, the workstation can take the
token from the network, fill it with data, mark the token as being used and place the token
back on the network.
TCP/IP means Transmission Control Protocol and Internet Protocol. It is the network model used in
the current Internet architecture as well. Protocols are set of rules which govern every possible
communication over a network. These protocols describe the movement of data between the
source and destination or the internet. These protocols offer simple naming and addressing
schemes.
Merits of TCP/IP
1. It operates independently.
2. It is scalable.
3. Client/server architecture.
4. Supports a number of routing protocols.
5. Can be used to establish a connection between two computers.
Demerits of TCP/IP
Data link layer is layer 2 in OSI model. It is responsible for communications between adjacent
network nodes. It handles the data moving in and out across the physical layer. It also provides a
well defined service to the network layer. The data link layer is divided into two sublayers: the Media
Access Control (MAC) and the Logical Link Control (LLC).
Data-Link layer ensures that an initial connection has been set up, divides output data into data
frames, and handles the acknowledgements from a receiver that the data arrived successfully. It
also ensures that incoming data has been received successfully by analyzing bit patterns at special
places in the frames.
In the following sections data link layer's functions- Error control and Flow control has been
discussed. After that MAC layer is explained. Multiple access protocols are explained in the MAC
layer section.
The network is responsible for the transmission of data from one device to another. The end to end
transfer of data from a transmitting application to a receiving application involves many steps, each
subject to error. With the error control process, we can be confident that the transmitted and
received data are identical. Data can be corrupted during transmission. For reliable communication,
error must be detected and corrected.
Error control is the process of detecting and correcting both the bit level and packet level errors.
Types of Errors
Single Bit Error
The term single bit error means that only one bit of the data unit was changed, from 1 to 0 or from
0 to 1.
Burst Error
The term burst error means that two or more bits in the data unit were changed. Burst error is also
called packet level error, where errors such as packet loss, duplication and reordering occur.
Error Detection
Error detection is the process of detecting the error during the transmission between the sender
and the receiver.
Types of error detection
Parity checking
Cyclic Redundancy Check (CRC)
Checksum
Redundancy
Redundancy allows a receiver to check whether received data was corrupted during transmission,
so that it can request a retransmission. Redundancy is the concept of using extra bits for
error detection. The sender adds redundant bits (R) to the data unit and sends it to the
receiver; the receiver takes the incoming bit stream and passes it through a checking function. If no
error is found, the data portion of the data unit is accepted and the redundant bits are discarded;
otherwise the receiver asks for retransmission.
Parity checking
Parity adds a single bit that indicates whether the number of 1 bits in the preceding data is even or
odd. If a single bit is changed in transmission, the message will change parity and the error can be
detected at this point. Parity checking is not very robust, since if the number of bits changed is
even, the check bit will be invalid and the error will not be detected.
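A minimal sketch of even-parity generation and checking (bit strings are represented as Python strings for illustration):

# Hedged sketch: even parity - the parity bit makes the total number of 1s even.
def add_even_parity(bits):
    return bits + ("0" if bits.count("1") % 2 == 0 else "1")

def parity_ok(bits_with_parity):
    return bits_with_parity.count("1") % 2 == 0

frame = add_even_parity("1011001")       # -> '10110010'
print(frame, parity_ok(frame))           # True: no error detected
print(parity_ok("10110011"))             # False: a single flipped bit is detected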
Cyclic Redundancy Check (CRC)
The sender appends to the data unit a number of 0s that is one less than the number of bits in the
predefined divisor. The result is then divided by the divisor using binary (modulo-2) division. The
remainder is called the CRC. The CRC is appended to the data unit and sent to the receiver.
Receiver follows following steps.
When the data unit arrives followed by the CRC, it is divided by the same divisor that was used to
find the CRC (the remainder).
If the remainder of this division is zero, the data is error free; otherwise it is
corrupted.
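A minimal sketch of CRC generation by modulo-2 (XOR) division; the data and the 5-bit divisor are illustrative values, not from the text:

# Hedged sketch: CRC by modulo-2 (XOR) division on bit strings.
def crc_remainder(data, divisor):
    bits = list(data + "0" * (len(divisor) - 1))      # append (divisor length - 1) zeros
    for i in range(len(bits) - len(divisor) + 1):
        if bits[i] == "1":
            for j in range(len(divisor)):
                bits[i + j] = str(int(bits[i + j]) ^ int(divisor[j]))
    return "".join(bits[-(len(divisor) - 1):])        # the remainder is the CRC

crc = crc_remainder("1101011011", "10011")            # illustrative data and divisor
print(crc)                                            # 1110, appended to the data unit
# At the receiver, dividing data+CRC by the same divisor leaves a zero remainder if error free.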
Checksum
Checksum is the third error detection mechanism. Checksum is typically used in the upper
layers, while parity checking and CRC are used in the lower layers. Checksum is also based on the
concept of redundancy.
In the checksum mechanism there are two operations to perform.
Checksum generator
The sender uses the checksum generator mechanism. First the data unit is divided into equal
segments of n bits. Then all segments are added together using 1's complement arithmetic. The sum
is then complemented; this becomes the checksum, which is sent along with the data unit.
Exp:
If the 16 bits 10001010 00100011 are to be sent to the receiver, the two 8-bit segments are added
(10001010 + 00100011 = 10101101) and the sum is complemented, giving the checksum 01010010.
The checksum is appended to the data unit, so the final data unit sent is 10001010 00100011
01010010.
Checksum checker
The receiver receives the data unit and divides it into segments of equal size. All segments,
including the checksum, are added using 1's complement arithmetic. The result is complemented
once again; if the final result is zero, the data is accepted, otherwise it is rejected.
Exp:
If the final complemented result at the receiver is nonzero, the data unit is rejected.
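A minimal sketch of the 8-bit one's-complement checksum used in the example above:

# Hedged sketch: one's-complement checksum over 8-bit segments.
def ones_complement_sum(segments, bits=8):
    total = 0
    for seg in segments:
        total += int(seg, 2)
        if total >> bits:                              # wrap any carry back into the sum
            total = (total & ((1 << bits) - 1)) + 1
    return total

segments = ["10001010", "00100011"]
checksum = ~ones_complement_sum(segments) & 0xFF       # complement of the sum
print(format(checksum, "08b"))                         # 01010010, appended by the sender

# Receiver: add all segments including the checksum and complement; zero means no error.
total = ones_complement_sum(segments + [format(checksum, "08b")])
print((~total & 0xFF) == 0)                            # True for error-free data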
Error Correction
This type of error control allows a receiver to reconstruct the original information when it has been
corrupted during transmission.
Hamming Code
It is a single bit error correction method using redundant bits.
In this method redundant bits are included with the original data. Now, the bits are arranged such
that different incorrect bits produce different error results and the corrupt bit can be identified. Once
the bit is identified, the receiver can reverse its value and correct the error. Hamming code can be
applied to any length of data unit and uses the relationships between the data and the redundancy
bits.
Algorithm: the redundant (parity) bits are placed at the bit positions that are powers of two (1, 2,
4, ...), and each parity bit covers a specific set of data bit positions.
If the error occurred at bit 7, which was changed from 1 to 0, the receiver recalculates the same sets
of bits used by the sender. In this way the exact location of the error can be identified; once the
bit is identified the receiver can reverse its value and correct the error.
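A hedged sketch of Hamming(7,4): four data bits, three parity bits at positions 1, 2 and 4; the receiver's recomputed parity bits form a syndrome that points at the corrupted position:

# Hedged sketch: Hamming(7,4) single-bit error correction.
def hamming_encode(d):                     # d = [d1, d2, d3, d4]
    p1 = d[0] ^ d[1] ^ d[3]                # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]                # covers positions 2, 3, 6, 7
    p3 = d[1] ^ d[2] ^ d[3]                # covers positions 4, 5, 6, 7
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]    # codeword positions 1..7

def hamming_correct(c):                    # c = received 7-bit codeword
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s3             # syndrome = position of the bad bit (0 = none)
    if pos:
        c[pos - 1] ^= 1                    # reverse the corrupted bit
    return c

code = hamming_encode([1, 0, 1, 1])
code[6] ^= 1                               # corrupt bit 7, as in the example above
print(hamming_correct(code))               # error located at position 7 and corrected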
Flow Control is one important design issue for the Data Link Layer that controls the flow of data
between sender and receiver.
In communication there is a medium between sender and receiver. When the sender
sends data to the receiver, a problem can arise in the case below:
1) The sender sends data at a higher rate and the receiver is too slow to support that data rate.
To solve the above problem, flow control is introduced in the Data Link Layer. It also works at
several higher layers. The main goal of flow control is to introduce efficiency in computer
networks.
Approaches of Flow Control
Feed back based Flow Control is used in Data Link Layer and Rate based
Flow Control is used in Network Layer.
1. TIMER: the sender starts a timer when it starts to send data; if the sender does not get an
acknowledgment within that time, it sends the buffered data once again to the receiver.
2. SEQUENCE NUMBER: the sender sends the data with a specific sequence number; after
receiving the data, the receiver acknowledges that sequence number, and the sender
expects the acknowledgment of the same sequence number.
This type of scheme is called Positive Acknowledgment with Retransmission (PAR).
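The timer and the sequence number together give PAR; a highly simplified, hedged sketch of the sender side (unreliable_send and wait_for_ack are imaginary stand-ins for the real channel):

# Hedged sketch: stop-and-wait sender with a timer and alternating sequence numbers.
import time

def send_with_par(frames, unreliable_send, wait_for_ack, timeout=1.0):
    seq = 0
    for frame in frames:
        while True:
            unreliable_send(seq, frame)            # transmit the frame and start the timer
            deadline = time.time() + timeout
            ack = wait_for_ack(deadline)           # acked sequence number, or None on timeout
            if ack == seq:                         # expected acknowledgment arrived
                break
            # timeout or wrong sequence number: retransmit the buffered frame
        seq ^= 1                                   # alternate 0/1 sequence numbers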
1. The sender first starts sending the data and the receiver starts sending data only after it receives data.
2. The receiver and sender both start sending packets simultaneously.
The first case is simple and works perfectly, but there will be an error in the second one. That error can
be duplication of a packet, without any transmission error.
The problem with pipelining is that if the sender is sending 10 packets and a problem occurs with the
8th one, the whole data may need to be resent. The protocols called Go Back N and Selective Repeat
were introduced to solve this problem. In these protocols there are two possibilities at the receiver's
end: it may have a large window size or a window size of one.
iii. A Protocol Using Selective Repeat
The protocol using Go Back N is good when errors are rare, but if the line is poor, it wastes a lot of
bandwidth on retransmitted frames. So to provide reliability, the Selective Repeat protocol was
introduced. In this protocol the sender starts its window size at 0 and grows it to some predefined
maximum number. The receiver's window size is fixed and equal to the maximum sender
window size. The receiver has a buffer reserved for each sequence number within its fixed window.
Whenever a frame arrives, its sequence number is checked to see whether it falls within
the window; if so, and if it has not already been received, it is accepted and stored. This action is
taken whether or not the frame is the one currently expected by the network layer.
The data link layer is divided into two sublayers: The Media Access Control (MAC) layer and
the Logical Link Control (LLC) layer. The MAC sublayer controls how a computer on the network
gains access to the data and permission to transmit it. The LLC layer controls frame
synchronization, flow control and error checking.
The MAC layer is one of the sublayers that make up the data link layer of the OSI reference model. The
MAC layer is responsible for moving packets from one Network Interface Card (NIC) to another across
the shared channel. The MAC sublayer uses MAC protocols to ensure that signals sent from different
stations across the same channel don't collide.
Different protocols are used for different shared networks, such as Ethernets, Token Rings,
Token Buses, and WANs.
1. ALOHA
ALOHA is a simple communication scheme in which each source in a network sends its data
whenever there is a frame to send without checking to see if any other station is active. After
sending the frame each station waits for implicit or explicit acknowledgment. If the frame
successfully reaches the destination, next frame is sent. And if the frame fails to be received at the
destination it is sent again.
Pure ALOHA: ALOHA is the simplest technique for multiple access. The basic idea of this mechanism
is that a user can transmit the data whenever they want. If data is successfully transmitted then there
isn't any problem. But if a collision occurs then the station will transmit again. The sender can detect
the collision if it doesn't receive an acknowledgment from the receiver.
Slotted ALOHA
In ALOHA a newly emitted packet can collide with a packet in progress. If all packets are of the
same length and take L time units to transmit, then it is easy to see that a packet collides with any
other packet transmitted in a time window of length 2L. If this time window is decreased somehow,
than number of collisions decreases and the throughput increase. This mechanism is used in
slotted ALOHA or S-ALOHA. Time is divided into equal slots of Length L. When a station wants to
send a packet it will wait till the beginning of the next time slot.
Advantages of slotted ALOHA:
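One way to quantify the advantage: with offered load G, pure ALOHA achieves throughput S = G * e^(-2G) (peaking near 18.4%), while slotted ALOHA achieves S = G * e^(-G) (peaking near 36.8%). A quick check:

# Hedged sketch: peak throughput of pure vs slotted ALOHA.
import math

def pure_aloha(G):
    return G * math.exp(-2 * G)      # vulnerable period of two frame times

def slotted_aloha(G):
    return G * math.exp(-G)          # vulnerable period of one slot

print(round(pure_aloha(0.5), 3))     # about 0.184 at G = 0.5
print(round(slotted_aloha(1.0), 3))  # about 0.368 at G = 1.0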
2. Carrier Sense Multiple Access (CSMA)
a. Persistent
When a station has data to send, it first listens to the channel to check whether anyone else is
transmitting or not. If it senses the channel idle, the station starts transmitting the data. If it senses
the channel busy, it waits until the channel is idle. When a station detects an idle channel, it transmits
its frame with probability P; that is why this protocol is called p-persistent CSMA. This protocol
applies to slotted channels. When a station finds the channel idle and transmits the frame with
probability 1, the protocol is known as 1-persistent. The 1-persistent protocol is the most
aggressive protocol.
b. Non-Persistent
Non-persistent CSMA is less aggressive than the p-persistent protocol. In this protocol, before
sending the data, the station senses the channel and if the channel is idle it starts transmitting the
data. But if the channel is busy, the station does not continuously sense it; instead it
waits for a random amount of time and repeats the algorithm. This leads to better
channel utilization but also results in longer delays compared to 1-persistent.
Transmission media is a pathway that carries the information from sender to receiver. We
use different types of cables or waves to transmit data. Data is transmitted normally through
electrical or electromagnetic signals.
It is the transmission media in which signals are confined to a specific path using wire or cable. The
types of Bounded/ Guided are discussed below.
Frequency
Thickness
Type of electromagnetic noise field
Distance from the shield to the noise source
Shield discontinuity
Grounding practices
Some STP cablings make use of a thick copper braided shield which makes the cable thicker,
heavier, and in turn much more difficult for installation as compared to the UTP cables.
COAXIAL CABLE:
Coaxial cable is a very common and widely used communication medium. For example, TV cable is
usually coaxial.
Coaxial cable gets its name because it contains two conductors that share the same axis.
The center conductor in the cable is usually copper. The copper can be either a solid wire or
stranded material.
Outside this central conductor is a non-conductive material. It is usually a white, plastic material
used to separate the inner conductor from the outer conductor. The outer conductor is a fine
mesh made from copper. It is used to help shield the cable from EMI.
Outside the copper mesh is the final protective cover.
The actual data travels through the center conductor in the cable. EMI interference is caught
by the outer copper mesh. There are different types of coaxial cable, varying by gauge and
impedance. Gauge is a measure of the cable thickness. It is measured by the Radio Grade
measurement, or RG number. The higher the RG number, the thinner the central conductor core;
the lower the number, the thicker the core.
Here are the most common coaxial standards:
50-Ohm RG-7 or RG-11 : used with thick Ethernet.
50-Ohm RG-58 : used with thin Ethernet
75-Ohm RG-59 : used with cable television
93-Ohm RG-62 : used with ARCNET.
Fiber Optics
Fiber optic cable does not use electrical signals to transmit data; it uses light. In fiber optic cable light
only moves in one direction, so for two-way communication to take place a second connection
must be made between the two devices. It is actually two strands of cable, and each strand is
responsible for one direction of communication. A laser at one device sends pulses of light
through this cable to the other device. These pulses are translated into 1s and 0s at the other end.
In the center of the fiber cable is a glass strand, or core. The light from the laser moves through this
glass to the other device. Around the internal core is a reflective material known as CLADDING.
No light escapes the glass core because of this reflective cladding.
Fiber optic cable has a bandwidth of more than 2 Gbps (gigabits per second).
A wireless network enables people to communicate and access applications and information without
wires. This provides freedom of movement and the ability to extend applications to different parts of a
building, city, or nearly anywhere in the world. Wireless networks allow people to interact with e-mail
or browse the Internet from a location that they prefer.
Many types of wireless communication systems exist, but a distinguishing attribute of a wireless
network is that communication takes place between computer devices. These devices include personal
digital assistants (PDAs), laptops, personal computers (PCs), servers, and printers. Computer devices
have processors, memory, and a means of interfacing with a particular type of network. Traditional cell
phones don't fall within the definition of a computer device; however, newer phones and even audio
headsets are beginning to incorporate computing power and network adapters. Eventually, most
electronics will offer wireless network connections.
As with networks based on wire, or optical fiber, wireless networks convey information between
computer devices. The information can take the form of e-mail messages, web pages, database
records, streaming video or voice. In most cases, wireless networks transfer data, such as e-mail
messages and files, but advancements in the performance of wireless networks are enabling support
for video and voice communications as well.
The Institute of Electrical and Electronics Engineers(IEEE) is a standards setting body. Each of their
standards is numbered and a subset of the number is the actual standard. The 802 family of standards
is the one developed for computer networking.
IEEE, or Institute of Electrical and Electronics Engineers, is a standards setting body. They create
standards for things like networking so products can be compatible with one another. You may have
heard of IEEE 802.11b - this is the standard that IEEE has set (in this example, wireless-b networking).
Several networking technologies: 802.2, 802.3, 802.5, 802.11, and FDDI. Each of these is just a standard
set of technologies, each with its own characteristics.
The technical definition for 802.2 is "the standard for the upper Data Link Layer sublayer also known as
the Logical Link Control layer. It is used with the 802.3, 802.4, and 802.5 standards (lower DL
sublayers)."
802.2 "specifies the general interface between the network layer (IP, IPX, etc) and the data link layer
(Ethernet, Token Ring, etc).
Basically, think of the 802.2 as the "translator" for the Data Link Layer. 802.2 is concerned with
managing traffic over the physical network. It is responsible for flow and error control. The Data Link
Layer wants to send some data over the network, 802.2 Logical Link Control helps make this possible. It
also helps by identifying the line protocol, like NetBIOS, or Netware.
The LLC acts like a software bus allowing multiple higher layer protocols to access one or more lower
layer networks. For example, if you have a server with multiple network interface cards, the LLC will
forward packets from those upper layer protocols to the appropriate network interface. This allows the
upper layer protocols to not need specific knowledge of the lower layer networks in use.
802.3 Ethernet
Now that we have an overview of the OSI model, we can continue on these topics. I hope you have a
clearer picture of the network model and where things fit on it.
802.3 is the standard which Ethernet operates by. It is the standard for CSMA/CD (Carrier Sense
Multiple Access with Collision Detection). This standard encompasses both the MAC and Physical Layer
standards.
CSMA/CD is what Ethernet uses to control access to the network medium (network cable). If there is
no data, any node may attempt to transmit, if the nodes detect a collision, both stop transmitting and
wait a random amount of time before retransmitting the data.
The original 802.3 standard is 10 Mbps (Megabits per second). 802.3u defined the 100 Mbps (Fast
Ethernet) standard, 802.3z/802.3ab defined 1000 Mbps Gigabit Ethernet, and 802.3ae defined 10
Gigabit Ethernet.
Commonly, Ethernet networks transmit data in packets, or small bits of information. A packet can be a
minimum size of 72 bytes or a maximum of 1518 bytes.
802.5 Token Ring
As we mentioned earlier when discussing the ring topology, Token Ring was developed primarily by
IBM. Token ring is designed to use the ring topology and utilizes a token to control the transmission of
data on the network.
The token is a special frame which is designed to travel from node to node around the ring. When it
does not have any data attached to it, a node on the network can modify the frame, attach its data and
transmit. Each node on the network checks the token as it passes to see if the data is intended for that
node, if it is; it accepts the data and transmits a new token. If it is not intended for that node, it
retransmits the token on to the next node.
The token ring network is designed in such a way that each node on the network is guaranteed access
to the token at some point. This equalizes the data transfer on the network. This is different from an
Ethernet network where each workstation has equal access to grab the available bandwidth, with the
possibility of one node using more bandwidth than other nodes.
Originally, token ring operated at speeds of 4 Mbps and 16 Mbps. 802.5t allows for 100 Mbps
speeds and 802.5v provides for 1 Gbps over fiber.
Token ring can be run over a star topology as well as the ring topology.
There are three major cable types for token ring: unshielded twisted pair (UTP), shielded twisted pair
(STP), and fiber.
Token ring utilizes a Multi-station Access Unit (MAU) as a central wiring hub. This is also sometimes
called a MSAU when referring to token ring networks.
802.11 is the collection of standards set up for wireless networking. You are probably familiar with the
popular standards 802.11a, 802.11b and 802.11g; the latest one is 802.11n. Each standard uses a
frequency band to connect to the network and has a defined upper limit for data transfer speeds.
802.11a was one of the first wireless standards. 802.11a operates in the 5Ghz radio band and can
achieve a maximum of 54 Mbps. It wasn't as popular as the 802.11b standard due to higher prices and
lower range.
802.11b operates in the 2.4 GHz band and supports up to 11 Mbps, with a range of up to several
hundred feet in theory. It was the first real consumer option for wireless and was very popular.
802.11g is a standard in the 2.4Ghz band operating at 54Mbps. Since it operates in the same band as
802.11b, 802.11g is compatible with 802.11b equipment. 802.11a is not directly compatible with
802.11b or 802.11g since it operates in a different band.
Wireless LANs primarily use CSMA/CA - Carrier Sense Multiple Access/Collision Avoidance. It has a
"listen before talk" method of minimizing collisions on the wireless network. This results in less need
for retransmitting data.
Cryptography can reformat and transform our data, making it safer on its trip between
computers. The technology is based on the essentials of secret codes, augmented by modern
mathematics that protects our data in powerful ways.
Computer Security - generic name for the collection of tools designed to protect data and to
thwart hackers
Internet Security - measures to protect data during their transmission over a collection of
interconnected networks.
Security Attacks, Services and Mechanisms: To assess the security needs of an organization
effectively, the manager responsible for security needs some systematic way of defining the
requirements for security and characterization of approaches to satisfy those requirements. One
approach is to consider three aspects of information security:
Security attack: Any action that compromises the security of information owned by an
organization.
Security mechanism: A mechanism that is designed to detect, prevent or recover from a
security attack.
Security service: A service that enhances the security of the data processing systems and the
information transfers of an organization. The services are intended to counter security attacks
and they make use of one or more security mechanisms to provide the service.
Basic Concepts:
Cipher: An algorithm for transforming an intelligible message into one that is unintelligible by
transposition and/or substitution methods.
Key: Some critical information used by the cipher, known only to the sender and receiver.
Encipher (encode): The process of converting plaintext to cipher text using a cipher and a key.
Decipher (decode): The process of converting cipher text back into plaintext using a cipher and
a key.
Known-Plaintext Analysis (KPA): The attacker decrypts ciphertexts with the help of known partial plaintext.
Chosen-Plaintext Analysis (CPA): Attacker uses ciphertext that matches arbitrarily selected
plaintext via the same algorithm technique.
Ciphertext-Only Analysis (COA): Attacker uses known ciphertext collections.
Man-in-the-Middle (MITM) Attack: Attack occurs when two parties use message or key
sharing for communication via a channel that appears secure but is actually compromised.
Attacker employs this attack for the interception of messages that pass through the
communications channel. Hash functions prevent MITM attacks.
Adaptive Chosen-Plaintext Attack (ACPA): Similar to a CPA, this attack uses chosen plaintext
and ciphertext based on data learned from past encryptions.
Cryptology: Both cryptography and cryptanalysis.
Code: An algorithm for transforming an intelligible message into an unintelligible one using a
code-book.
Cryptography:
The type of operations used for transforming plain text to cipher text: all encryption algorithms
are based on two general principles: substitution, in which each element in the plaintext is
mapped into another element, and transposition, in which elements in the plaintext are
rearranged.
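A toy illustration of the two principles, sketched in Python (the shift of 3 and the column count of 4 are arbitrary choices for the example):

# Hedged sketch: substitution (Caesar shift) versus transposition (columnar rearrangement).
def substitute(text, shift=3):
    return "".join(chr((ord(ch) - 65 + shift) % 26 + 65) for ch in text)

def transpose(text, columns=4):
    rows = [text[i:i + columns] for i in range(0, len(text), columns)]
    return "".join("".join(row[c] for row in rows if c < len(row)) for c in range(columns))

print(substitute("ATTACKATDAWN"))   # each letter replaced  -> DWWDFNDWGDZQ
print(transpose("ATTACKATDAWN"))    # letters rearranged    -> ACDTKATAWATN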
The number of keys used: if the sender and receiver use the same key, the system is referred to as
symmetric key (or single key, or conventional) encryption. If the sender and receiver use
different keys, it is referred to as public key encryption.
The way in which the plain text is processed: a block cipher processes the input one block of
elements at a time, producing an output block for each input block. A stream cipher processes the
input elements continuously, producing output one element at a time, as it goes along.
Cryptanalysis:
There are various types of cryptanalytic attacks based on the amount of information known
to the cryptanalyst.
Ciphertext only: A copy of the cipher text alone is known to the cryptanalyst.
Known plaintext: The cryptanalyst has a copy of the cipher text and the corresponding plaintext.
Chosen plaintext: The cryptanalyst gains temporary access to the encryption machine. They
cannot open it to find the key; however, they can encrypt a large number of suitably chosen
plaintexts and try to use the resulting cipher texts to deduce the key.
Chosen ciphertext: The cryptanalyst obtains temporary access to the decryption machine,
uses it to decrypt several strings of symbols, and tries to use the results to deduce the key.
Diffie-Hellman:
Diffie-Hellman is a key-exchange protocol that lets two parties establish a shared secret over an
insecure channel. It is often explained with a colour-mixing analogy.
Working (based on the colour-mixing analogy):
Alice and Bob each produce a mix based upon their own secret colour
they exchange the mixes between them
each combines the received mix with their own secret colour to finalize a common secret
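The colour-mixing steps correspond to modular exponentiation; a toy sketch with insecurely small numbers (p = 23, g = 5 and the private values are illustrative only):

# Hedged sketch: textbook Diffie-Hellman with toy numbers (not secure).
p, g = 23, 5                       # public prime and generator (illustrative)
a, b = 6, 15                       # Alice's and Bob's private values ("secret colours")

A = pow(g, a, p)                   # Alice's public mix
B = pow(g, b, p)                   # Bob's public mix, exchanged over the open channel

print(pow(B, a, p), pow(A, b, p))  # both sides arrive at the same shared secret (2, 2)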
RSA:
RSA is used to come up with a public/private key pair for asymmetric ("public-key") encryption.
Working:
the sender encrypts the data to be transferred using the public key of the
recipient
the receiver decrypts the encrypted data using his private key
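A textbook-sized RSA sketch with tiny primes (the values are illustrative and far too small for real use; pow(e, -1, phi) needs Python 3.8+):

# Hedged sketch: textbook RSA key generation, encryption and decryption (not secure).
p, q = 61, 53
n = p * q                          # 3233, part of both keys
phi = (p - 1) * (q - 1)            # 3120
e = 17                             # public exponent, coprime with phi
d = pow(e, -1, phi)                # private exponent = modular inverse of e, here 2753

message = 65
cipher = pow(message, e, n)        # sender encrypts with the recipient's public key (e, n)
print(pow(cipher, d, n))           # receiver decrypts with the private key (d, n) -> 65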
Web application security is the process of securing confidential data stored online from
unauthorized access and modification. This is accomplished by enforcing stringent policy
measures. Security threats can compromise the data stored by an organization when hackers
with malicious intentions try to gain access to sensitive information.
The aim of Web application security is to identify the following:
SQL Injection
XSS (Cross Site Scripting)
Remote Command Execution
Path Traversal
1)SQL Injection: SQL injection is a type of security exploit in which the attacker adds
Structured Query Language (SQL) code to a Web form input box to gain access to
resources or make changes to data. An SQL query is a request for some action to be
performed on a database. Typically, on a Web form for user authentication, when a user
enters their name and password into the text boxes provided for them, those values are
inserted into a SELECT query. If the values entered are found as expected, the user is
allowed access; if they aren't found, access is denied. However, most Web forms have no
mechanisms in place to block input other than names and passwords. Unless such
precautions are taken, an attacker can use the input boxes to send their own request to
the database, which could allow them to download the entire database or interact with it in
other illicit ways. By injecting a SQL statement such as ' OR 1=1 --, the attacker can
access information stored in the web site's database. Of course, the example used above
represents a relatively simple SQL statement. Ones used by attackers are often much
more sophisticated, if they know what the tables in the database are, since these complex
statements can generally produce better results.
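The standard defence is to keep SQL code and user input separate with parameterized queries; a hedged sketch using Python's built-in sqlite3 (the table, column names and credentials are illustrative):

# Hedged sketch: string-built query (vulnerable) versus parameterized query (safe).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'secret')")

name, password = "anyone", "' OR '1'='1"      # attacker-controlled input

# Vulnerable: the input becomes part of the SQL statement itself.
unsafe = "SELECT * FROM users WHERE name = '%s' AND password = '%s'" % (name, password)
print(len(conn.execute(unsafe).fetchall()))   # 1 row returned: access granted!

# Safe: placeholders keep the input as data, never as executable SQL.
safe = "SELECT * FROM users WHERE name = ? AND password = ?"
print(len(conn.execute(safe, (name, password)).fetchall()))   # 0 rows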
2)Cross Site Scripting: Cross-Site Scripting (XSS) attacks are a type of injection, in
which malicious scripts are injected into otherwise benign and trusted web sites. XSS
attacks occur when an attacker uses a web application to send malicious code, generally
in the form of a browser side script, to a different end user. Flaws that allow these attacks
to succeed are quite widespread and occur anywhere a web application uses input from a
user within the output it generates without validating or encoding it. An attacker can use
XSS to send a malicious script to an unsuspecting user. The end user's browser has no
way to know that the script should not be trusted, and will execute the script. Because it
thinks the script came from a trusted source, the malicious script can access any cookies,
session tokens, or other sensitive information retained by the browser and used with that
site. These scripts can even rewrite the content of the HTML page.
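The usual countermeasure is to validate input and encode it before it is written into the page, so the browser renders it as text instead of executing it; a minimal sketch using Python's html.escape (the malicious input is made up):

# Hedged sketch: encoding untrusted input before placing it in an HTML page.
import html

comment = '<script>steal(document.cookie)</script>'      # made-up malicious input

page_unsafe = "<p>" + comment + "</p>"                    # the script would run in the victim's browser
page_safe = "<p>" + html.escape(comment) + "</p>"         # rendered as harmless text

print(page_safe)   # <p>&lt;script&gt;steal(document.cookie)&lt;/script&gt;</p>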
3)Remote Command Execution:Remote Command Execution vulnerabilities allow
attackers to pass arbitrary commands to other applications. In severe cases, the attacker
can obtain system level privileges allowing them to attack the servers from a remote
location and execute whatever commands they need for their attack to be successful.
HTTPS was originally used mainly to secure sensitive web traffic such as financial
transactions, but it is now common to see it used by default on many sites we use
in our day to day lives such as social networking and search engines. The HTTPS
protocol uses the Transport Layer Security (TLS) protocol, the successor to the
Secure Sockets Layer (SSL) protocol, to secure communications. When configured
and used correctly, it provides protection against eavesdropping and tampering,
along with a reasonable guarantee that a website is the one we intend to be using.
Or, in more technical terms, it provides confidentiality and data integrity, along with
authentication of the website's identity.
IPSec:IPsec (Internet Protocol Security) is a framework for a set of protocols for security at
the network or packet processing layer of network communication. It is an Internet
Engineering Task Force (IETF) standard suite of protocols that provides data
authentication, integrity, and confidentiality as data is transferred between communication
points across IP networks. IPSec provides data security at the IP packet level. A packet is
a data bundle that is organized for transmission across a network, and it includes a header
and payload (the data in the packet). IPSec emerged as a viable network security standard
because enterprises wanted to ensure that data could be securely transmitted over the
Internet. IPSec protects against possible security exposures by protecting data while in
transit.
Padding: 0 to 255 bytes, used for 32-bit alignment and to match the block size of the
block cipher.
Padding Length: Indicates the length of the Padding field in bytes. This field is
used by the receiver to discard the Padding field.
Next Header: Identifies the nature of the payload, such as TCP or UDP.
Authentication Data: Contains the Integrity Check Value (ICV), a message
authentication code that is used to verify the sender's identity and message integrity. The
ICV is calculated over the ESP header, the payload data and the ESP trailer.
In IPv4, the AH protects the IP payload and all header fields of an IP datagram
except for mutable fields (i.e. those that might be altered in transit), and also IP
options such as the IP Security Option (RFC 1108). Mutable (and therefore
unauthenticated) IPv4 header fields are DSCP/ToS, ECN, Flags, Fragment Offset,
TTL and Header Checksum.
In IPv6, the AH protects most of the IPv6 base header, AH itself, non-mutable
extension headers after the AH, and the IP payload. Protection for the IPv6 header
excludes the mutable fields: DSCP, ECN, Flow Label, and Hop Limit.
3)Internet Key Exchange (IKE): The Internet Key Exchange (IKE) is an IPsec (Internet
Protocol Security) standard protocol used to ensure security for virtual private network
(VPN) negotiation and remote host or network access. Specified in IETF Request for
Comments (RFC) 2409, IKE defines an automatic means of negotiation and authentication
for IPsec security associations (SA). Security associations are security policies defined for
communication between two or more entities; the relationship between the entities is
represented by a key. The IKE protocol ensures security for SA communication without the
preconfiguration that would otherwise be required.
Eliminates the need to manually specify all the IPSec security parameters in the
crypto maps at both peers.
Allows you to specify a lifetime for the IPSec security association.
Allows encryption keys to change during IPSec sessions.
Allows IPSec to provide anti-replay services.
Permits Certification Authority (CA) support for a manageable, scalable IPSec
implementation.
Allows dynamic authentication of peers.
Kerberos was created by MIT as a solution to these network security problems. The
Kerberos protocol uses strong cryptography so that a client can prove its identity to a
server (and vice versa) across an insecure network connection. After a client and server
have used Kerberos to prove their identity, they can also encrypt all of their communications
to assure privacy and data integrity as they go about their business.
Kerberos uses the concept of a ticket as a token that proves the identity of a user. Tickets
are digital documents that store session keys. They are typically issued during a login
session and then can be used instead of passwords for any Kerberized services. During
the course of authentication, a client receives two tickets:
A ticket-granting ticket (TGT), which acts as a global identifier for a user and a session
key
A service ticket, which authenticates a user to a particular service
These tickets include time stamps that indicate an expiration time after which they become
invalid. This expiration time can be set by Kerberos administrators depending on the
service.
To accomplish secure authentication, Kerberos uses a trusted third party known as a key
distribution center (KDC), which is composed of two components, typically integrated
into a single server:
An authentication server (AS), which performs user authentication
A ticket-granting server (TGS), which grants tickets to users
The authentication server keeps a database storing the secret keys of the users and
services. The secret key of a user is typically generated by performing a one-way hash of
the user-provided password. Kerberos is designed to be modular, so that it can be used
with a number of encryption protocols, with AES being the default cryptosystem.
Kerberos aims to centralize authentication for an entire network; rather than storing
sensitive authentication information at each user's machine, this data is maintained in only
one, presumably secure, location.
To start the Kerberos authentication process, the initiating client sends a request to an
authentication server for access to a service. The initial request is sent as plaintext
because no sensitive information is included in the request.The authentication server
retrieves the initiating client's private key, assuming the initiating client's username is in the
KDC database. If the initiating client's username cannot be found in the KDC database, the
client cannot be authenticated and the authentication process stops. If the client's
username can be found in the KDC database, the authentication server generates a
session key and a ticket granting ticket. The ticket granting ticket is timestamped and
encrypted by the authentication server with the initiating client's password.The initiating
client is then prompted for a password; if what is entered matches the password in the
KDC database, the encrypted ticket granting ticket sent from the authentication server is
decrypted and used to request a credential from the ticket granting server for the desired
service. The client sends the ticket granting ticket to the ticket granting server, which may
be physically running on the same hardware as the authentication server, but performing a
different role.
The ticket granting service carries out an authentication check similar to that performed by
the authentication server, but this time sends credentials and a ticket to access the
requested service. This transmission is encrypted with a session key specific to the user
and service being accessed. This proof of identity can be used to access the requested
"kerberized" service, which, once having validated the original request, will confirm its
identity to the requesting system.The timestamped ticket sent by the ticket granting service
allows the requesting system to access the service using a single ticket for a specific time
period without having to be re-authenticated. Making the ticket valid for a limited time
period makes it less likely that someone else will be able to use it later; it is also possible
to set the maximum lifetime to 0, in which case service tickets will not expire. Microsoft
recommends a maximum lifetime of 600 minutes for service tickets; this is the default
value in Windows Server implementations of Kerberos.
Kerberos Advantages
The Kerberos protocol is designed to be secure even when performed over an insecure
network.
Since each transmission is encrypted using an appropriate secret key, an attacker cannot
forge a valid ticket to gain unauthorized access to a service without compromising an
encryption key or breaking the underlying encryption algorithm, which is assumed to be
secure.
Kerberos is also designed to protect against replay attacks, where an attacker
eavesdrops on legitimate Kerberos communications and retransmits messages from an
authenticated party to perform unauthorized actions.
The inclusion of time stamps in Kerberos messages restricts the window in which an
attacker can retransmit messages.
Tickets may contain the IP addresses associated with the authenticated party to prevent
replaying messages from a different IP address.
Kerberized services make use of a replay cache, which stores previous authentication
tokens and detects their reuse.
Kerberos makes use of symmetric encryption instead of public-key encryption, which
makes Kerberos computationally efficient
The availability of an open-source implementation has facilitated the adoption of
Kerberos.
Kerberos Disadvantages
Kerberos has a single point of failure: if the Key Distribution Center becomes unavailable,
the authentication scheme for an entire network may cease to function. Larger networks
sometimes prevent such a scenario by having multiple KDCs, or having backup KDCs
available in case of emergency.
If an attacker compromises the KDC, the authentication information of every client and
server on the network would be revealed.
Kerberos requires that all participating parties have synchronized clocks, since time
stamps are used.
Virus: A computer virus is a program, script, or macro designed to cause damage, steal
personal information, modify data, send e-mail, display messages, or some combination of
these actions.When the virus is executed, it spreads by copying itself into or over data
files, programs, or boot sector of a computer's hard drive, or potentially anything else
writable. To help spread an infection the virus writers use detailed knowledge of security
vulnerabilities, zero days, or social engineering to gain access to a host's computer.
Types of Virus:
1)Boot Sector Virus:A Boot Sector Virus infects the first sector of the hard drive, where the
Master Boot Record (MBR) is stored. The Master Boot Record (MBR) stores the disk's
primary partition table and bootstrapping instructions which are executed after the
computer's BIOS passes execution to machine code. If a computer is infected with a Boot
Sector Virus, when the computer is turned on, the virus launches immediately and is
loaded into memory, enabling it to control the computer.Examples of boot viruses are
polyboot and antiexe.
2)File Deleting Viruses:A File Deleting Virus is designed to delete critical files which are
the part of Operating System or data files.
3)Mass Mailer Viruses:Mass Mailer Viruses search e-mail programs like MS outlook for e-
mail addresses which are stored in the address book and replicate by e-mailing
themselves to the addresses stored in the address book of the e-mail program.
4)Macro Virus: Document or macro viruses are written in a macro language. Such
languages are usually included in advanced applications such as word processing and
spreadsheet programs. The vast majority of known macro viruses replicate using the MS
Office program suite, mainly MS Word and MS Excel, but some viruses targeting other
applications are known as well. The symptoms of infection include the automatic restart of
computer again and again. Commonly known types of macro viruses are Melissa A,
Bablas and Y2K Bug.
5)File Infector: Another common problem is the file infector virus, which attaches itself to
program files and infects them while they are being processed, written or executed. Unwanted
dialog boxes start appearing on the screen with unknown statements, and files with the
extensions .com and .exe are typical targets. These viruses destroy the original copy of the file
and save the infected file with the same name as the original. Once infected, it is very
hard to recover the original data.
6)Stealth viruses: Stealth viruses have the capability to hide from operating system or anti-
virus software by making changes to file sizes or directory structure. Stealth viruses are
anti-heuristic in nature, which helps them hide from heuristic detection.
7)Resident Virus: These are threat programs that permanently reside in the
random access memory of the computer system. When the computer is started they are
automatically transferred to and from the secondary storage media, interrupting the
sequential operations of the processor and corrupting the running programs. For instance,
Randex and CMJ are commonly known resident viruses. If these viruses get into the hard disk
then one may have to replace the secondary storage media and sometimes even the RAM.
8)Polymorphic Viruses: Polymorphic viruses change their form in order to avoid detection
and disinfection by anti-virus applications. After each infection, these viruses try to hide
from the anti-virus application by encrypting parts of the virus itself. This is known as
mutation.
9)Retrovirus: A retrovirus is another type of virus which tries to attack and disable the anti-virus
application running on the computer. A retrovirus can be considered anti-antivirus. Some
retroviruses attack the anti-virus application and stop it from running, while others
destroy the virus definition database.
Worms:
A computer worm is a self-replicating computer program that penetrates an operating
system with the intent of spreading malicious code. Worms utilize networks to send copies
of the original code to other computers, causing harm by consuming bandwidth or possibly
deleting files or sending documents via email. Worms can also install backdoors on
computers. Worms are often confused with computer viruses; the difference lies in how
they spread. Computer worms self-replicate and spread across networks, exploiting
vulnerabilities automatically; that is, they don't need a cyber criminal's guidance, nor do
they need to latch onto another computer program.
A mail worm is carried by an email message, usually as an attachment but there have
been some cases where the worm is located in the message body. The recipient must
open or execute the attachment before the worm can activate. The
attachment may be a document with the worm attached in a virus-like manner, or it may
be an independent file. The worm may very well remain undetected by the user if it is
attached to a document. The document is opened normally and the user's attention is
probably focused on the document contents when the worm activates. Independent worm
files usually fake an error message or perform some similar action to avoid detection.
Pure worms have the potential to spread very quickly because they are not dependent on
any human actions, but the current networking environment is not ideal for them. They
usually require a direct real-time connection between the source and target computer
when the worm replicates.
Backdoor: These are created to give an unauthorized user remote control of a computer.
Once installed on a machine, the remote user can then do anything they wish with the
infected computer. This often results in uniting multiple backdoor Trojan-infected
computers working together for criminal activity.
Rootkit: Programmed to conceal files and computer activities, rootkits are often created to
hide further malware from being discovered. Normally, this is so malicious programs can
run for an extended period of time on the infected computer.
DDoS: A subset of backdoor Trojans; distributed denial-of-service (DDoS) attacks are launched
from numerous compromised computers to make a web address fail.
Banker: Trojan-bankers are created for the sole purpose of gathering users' bank, credit
card, debit card and e-payment information.
FakeAV: This type of Trojan is used to convince users that their computers are infected
with numerous viruses and other threats in an attempt to extort money. Often, the threats
aren't real, and the FakeAV program itself is what is causing problems in the first place.
Ransom: Trojan-Ransoms will modify or block data on a computer either so it doesn't work
properly or so certain files can't be accessed. The person disrupting the computer will
restore the computer or files only after a user has paid a ransom. Data blocked this way is
often impossible to recover without the criminal's approval.
1)SAML (Security Assertion Markup Language) is an open standard for exchanging
authentication information between a service provider and an identity provider (IdP). A
third-party IdP is used to authenticate users and to pass identity information to the service
provider in the form of a digitally signed XML (Extensible Markup Language)
document. Tableau Server is a service provider. Examples of IdPs include PingOne and
OneLogin.SAML is designed for business-to-business (B2B) and business-to-consumer
(B2C) transactions.
Single sign-on (SSO) is a session and user authentication service that permits a user to
use one set of login credentials (e.g., name and password) to access multiple applications.
The service authenticates the end user for all the applications the user has been given
rights to and eliminates further prompts when the user switches applications during the
same session. On the back end, SSO is helpful for logging user activities as well as
monitoring user accounts. Some SSO services use protocols such as Kerberos and the
Security Assertion Markup Language (SAML).
Protocol defines how SAML asks for and receives assertions. Binding defines how
SAML message exchanges are mapped to Simple Object Access Protocol (SOAP)
exchanges. SAML works with multiple protocols including Hypertext Transfer
Protocol (HTTP), Simple Mail Transfer Protocol (SMTP), File Transfer Protocol (FTP)
and also supports SOAP, BizTalk, and Electronic Business XML (ebXML). The
Organization for the Advancement of Structured Information Standards (OASIS) is
the standards group for SAML.
2)OAuth 2
OAuth, which was first released in 2007, was conceived as an authentication method for
the Twitter application program interface (API). In 2010, The IETF OAuth Working Group
published OAuth 2.0. Like the original OAuth, OAuth 2.0 provides users with the ability to
grant third-party access to web resources without sharing a password. Updated features
available in OAuth 2.0 include new flows, simplified signatures and short-lived tokens with
long-lived authorizations. OAuth 2 is an authorization framework that enables applications
to obtain limited access to user accounts on an HTTP service, such as Facebook, GitHub,
and DigitalOcean. It works by delegating user authentication to the service that hosts the
user account, and authorizing third-party applications to access the user account. OAuth 2
provides authorization flows for web and desktop applications, and mobile devices.
OpenID Connect is built directly on OAuth 2.0 and in most cases is deployed right along
with (or on top of) an OAuth infrastructure. OpenID Connect also uses the JSON Object
Signing And Encryption (JOSE) suite of specifications for carrying signed and encrypted
information around in different places. In fact, an OAuth 2.0 deployment with JOSE
capabilities is already a long way to defining a fully compliant OpenID Connect system,
and the delta between the two is relatively small.
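As a rough illustration of the OAuth 2 authorization code flow, the sketch below builds an authorization URL and then exchanges the returned code for an access token. The endpoints, client credentials and redirect URI are hypothetical placeholders, not those of any real provider; treat this as a sketch of the flow, not a definitive implementation.

import urllib.parse
import requests  # third-party HTTP library

# Hypothetical provider endpoints and client registration values
AUTHORIZE_URL = "https://auth.example.com/oauth/authorize"
TOKEN_URL = "https://auth.example.com/oauth/token"
CLIENT_ID = "my-client-id"
CLIENT_SECRET = "my-client-secret"
REDIRECT_URI = "https://myapp.example.com/callback"

# Step 1: send the user to the provider's authorization page
params = {
    "response_type": "code",
    "client_id": CLIENT_ID,
    "redirect_uri": REDIRECT_URI,
    "scope": "read",
    "state": "random-anti-csrf-value",
}
print("Visit:", AUTHORIZE_URL + "?" + urllib.parse.urlencode(params))

# Step 2: after the user approves, the provider redirects back with ?code=...
# Exchange that code for an access token (server-to-server request)
def exchange_code(code: str) -> dict:
    response = requests.post(TOKEN_URL, data={
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    })
    response.raise_for_status()
    return response.json()  # typically contains access_token, token_type, expires_in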
Firewall
A firewall is a network security device that monitors incoming and outgoing network
traffic and decides whether to allow or block specific traffic based on a defined set of
security rules.
Firewalls have been a first line of defense in network security for over 25 years. They
establish a barrier between secured, controlled internal networks that can be trusted and
untrusted outside networks, such as the Internet.
3. Stateful inspection firewalls, on the other hand, not only examine each
packet, but also keep track of whether or not that packet is part of an
established TCP session. This offers more security than either packet
filtering or circuit monitoring alone, but exacts a greater toll on network
performance.
While stateful inspection firewalls are the most secure, they are also rather complex
and the most likely to be misconfigured. Whichever firewall type you choose,
keep in mind that a misconfigured firewall can in some ways be worse than
no firewall at all, because it lends the dangerous impression of security while
providing little or none.
Digital Signature: Signature is the proof to the receiver that the document comes
from the correct entity. The person who signs it takes the responsibility of the content
present in the document. A signature on a document, when verified, is a sign of
authentication; the document is authentic.
The value of the hash is unique to the hashed data. Any change in the data,
even changing or deleting a single character, results in a different value. This
attribute enables others to validate the integrity of the data by using the
signer's public key to decrypt the hash. If the decrypted hash matches a
second computed hash of the same data, it proves that the data hasn't
changed since it was signed. If the two hashes don't match, the data has
either been tampered with in some way (integrity) or the signature was
created with a private key that doesn't correspond to the public key
presented by the signer (authentication).A digital signature can be used with
any kind of message -- whether it is encrypted or not -- simply so the receiver
can be sure of the sender's identity and that the message arrived intact.
Digital signatures make it difficult for the signer to deny having signed
something (non-repudiation) -- assuming their private key has not been
compromised -- as the digital signature is unique to both the document and
the signer, and it binds them together. A digital certificate, an electronic
document that contains the digital signature of the certificate-issuing
authority, binds together a public key with an identity and can be used to
verify a public key belongs to a particular person or entity.
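The hash-based integrity idea can be illustrated with a minimal sketch using Python's standard hashlib; it shows only the hashing and comparison step, not the public-key signing and decryption described above, and the messages are made up for the example.

import hashlib

def digest(data: bytes) -> str:
    # SHA-256 produces a fixed-size value that changes if even one byte changes
    return hashlib.sha256(data).hexdigest()

original = b"Pay 100 rupees to account 12345"
tampered = b"Pay 900 rupees to account 12345"

sent_hash = digest(original)          # hash computed (and signed) by the sender
print(digest(original) == sent_hash)  # True  -> data unchanged
print(digest(tampered) == sent_hash)  # False -> data has been tampered with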
Hackers are classified according to the intent of their actions.
What is Cybercrime:
Cybercrime is the use of computers and networks to perform illegal activities such
as spreading computer viruses, online bullying, performing unauthorized electronic
fund transfers, etc. Most cybercrimes are committed through the internet. Some
cybercrimes can also be carried out using Mobile phones via SMS and online
chatting applications.
Type of Cybercrime
The following list presents the common types of cybercrimes:
Computer Fraud: Intentional deception for personal gain via the use of
computer systems.
Privacy violation: Exposing personal information such as email addresses,
phone number, account details, etc. on social media, websites, etc.
Identity Theft: Stealing personal information from somebody and
impersonating that person.
Sharing copyrighted files/information: This involves distributing copyright
protected files such as eBooks and computer programs etc.
Electronic funds transfer: This involves gaining an un-authorized access to
bank computer networks and making illegal fund transfers.
Electronic money laundering: This involves the use of the computer to
launder money.
ATM Fraud: This involves intercepting ATM card details such as account
number and PIN numbers. These details are then used to withdraw funds
from the intercepted accounts.
Denial of Service Attacks: This involves the use of computers in multiple
locations to attack servers with a view of shutting them down.
Spam: Sending unauthorized emails. These emails usually contain
advertisements.
1)Web Bugs: A Web bug is a small GIF format image file that can be embedded in a Web page
or an HTML format email message. A Web bug can be as small as a single pixel in size and
can easily be hidden anywhere in an HTML document.
3)Cookies: A cookie is a small text file that a Web server asks your browser to place on your
computer. The cookie contains information that identifies your computer (its IP address),
you (your user name and email address), and information about your visit to the Web site.
If you set up an account at a Web site such as an e-commerce site, the cookie will contain
information about your account, making it easy for the server to find and manage your
account whenever you visit.
4)Snagging: In the right setting, a thief can try snagging information by listening in on
a telephone extension, through a wiretap, or over a cubicle wall while the victim gives
credit card or other personal information to a legitimate agent.
5)Flooders: Used to attack networked computer systems with a large volume of traffic
to carry out a denial of service (DoS) attack.
6)Rootkit: Set of hacker tools used after attacker has broken into computer system
and gained root-level access.
Types of Intrusions:
External attacks
attempted break-ins, denial of service attacks, etc.
Internal attacks
Masquerading as some other user
Mechanisms Used:
Examples :
Car Alarms
House Alarms
Surveillance Systems
Spy Satellites, and spy planes
Artificial neural networks: Non-linear predictive models that learn through training
and resemble biological neural networks in structure.
Rule induction: The extraction of useful if-then rules from data based on statistical
significance.
2)Data Warehouse:
A data warehouse is a:
subject-oriented
integrated
time-variant
non-volatile
collection of data in support of the management's decision-making process. A data
warehouse is a centralized repository that stores data from multiple information sources
and transforms them into a common, multidimensional data model for efficient querying
and analysis.
Subject Oriented:Data warehouses are designed to help you analyze data. For example,
to learn more about your company's sales data, you can build a warehouse that
concentrates on sales. Using this warehouse, you can answer questions like "Who was
our best customer for this item last year?" This ability to define a data warehouse by
subject matter, sales in this case, makes the data warehouse subject oriented.
Nonvolatile:Nonvolatile means that, once entered into the warehouse, data should not
change. This is logical because the purpose of a warehouse is to enable you to analyze
what has occurred.
Time Variant:In order to discover trends in business, analysts need large amounts of data.
This is very much in contrast to online transaction processing (OLTP) systems, where
performance requirements demand that historical data be moved to an archive. A data
warehouse's focus on change over time is what is meant by the term time variant.
There are two approaches to data warehousing, top down and bottom up. The top down
approach spins off data marts for specific groups of users after the complete data
warehouse has been created. The bottom up approach builds the data marts first and then
combines them into a single, all-encompassing data warehouse.
Slice and dice refers to a strategy for segmenting, viewing and understanding data in a
database. Users slice and dice by cutting a large segment of data into smaller parts, and
repeating this process until arriving at the right level of detail for analysis. Slicing and
dicing helps provide a closer view of data for analysis and presents data in new and
diverse perspectives. The term is typically used with OLAP databases that present
information to the user in the form of multidimensional cubes similar to a 3D spreadsheet.
ETL process:
ETL (Extract, Transform and Load) is a process in data warehousing responsible for
pulling data out of the source systems and placing it into a data warehouse. ETL involves
the following tasks:
Extracting the data from source systems (SAP, ERP, other operational systems); data
from the different source systems is converted into one consolidated data warehouse format
which is ready for transformation processing.
Transforming the data, which may involve tasks such as:
applying business rules (so-called derivations, e.g., calculating new measures and
dimensions),
cleaning (e.g., mapping NULL to 0 or "Male" to "M" and "Female" to "F" etc.),
filtering (e.g., selecting only certain columns to load),
splitting a column into multiple columns and vice versa,
joining together data from multiple sources (e.g., lookup, merge),
transposing rows and columns,
applying any kind of simple or complex data validation (e.g., if the first 3 columns
in a row are empty then reject the row from processing).
Loading the data into a data warehouse or data repository, or into other reporting
applications. A small transformation sketch follows this list.
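A minimal sketch of the transformation step, assuming rows arrive as Python dictionaries with hypothetical column names; it applies the cleaning and filtering rules mentioned above.

# Hypothetical source rows pulled from an operational system
source_rows = [
    {"id": 1, "gender": "Male", "sales": None, "notes": "ignore me"},
    {"id": 2, "gender": "Female", "sales": 250, "notes": "ignore me"},
]

def transform(row: dict) -> dict:
    return {
        "id": row["id"],                                                  # filtering: keep only needed columns
        "gender": {"Male": "M", "Female": "F"}.get(row["gender"], "U"),   # cleaning ("U" = unknown, an assumption)
        "sales": row["sales"] if row["sales"] is not None else 0,         # map NULL to 0
    }

warehouse_rows = [transform(r) for r in source_rows]                      # ready to load
print(warehouse_rows)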
A Data Mart is one piece of a data warehouse where all the information is related to
a specific business area. Therefore it is considered a subset of all the data stored in
that particular database, since all data marts together create a data warehouse.
This idea of subsetting the information can be easily extrapolated to different
departments in a company or distinct business areas with lots of data related to it.
They are all related to the same company but divided by usability into several data
marts.
So a data mart is a subset of data specific to the tasks of some group of users, creating a
view in a format that makes information easier to use and analyse by the end users of
your system.
Data Lake: Data warehousing applies structure to data on the way in, organizing it to
fit the context of the database schema. Data lakes facilitate a much more fluid
approach; they only add structure to data as it is dispensed to the application layer.
In storage, data lakes preserve the original structured or unstructured forms; a data
lake is a Big Data storage and retrieval system that could conceivably scale upward
indefinitely. Data lakes are often associated with Hadoop-oriented object storage. In
such a scenario, an organization's data is first loaded into the Hadoop platform, and
then business analytics and data mining tools are applied to the data where it resides
on Hadoop's cluster nodes of commodity computers. Microsoft Azure Data Lake is a
highly scalable data storage and analytics service. The service is hosted in Azure,
Microsoft's public cloud, and is largely intended for big data storage and analysis.
Data SWAMP: Data lakes do not require much structure, and they accept all data.
However, in poorly designed and neglected systems, they risk becoming data swamps.
A Data Swamp is the term that describes the failure to document the stored data
accurately, resulting in the inability to analyze and exploit the data efficiently; the
original data may remain, but the data swamp cannot retrieve it without the metadata
that gives it context.
Data Cube: A Data Cube is an application that puts data into matrices of three or more
dimensions. Transformations of the data are expressed as tables, i.e. arrays of processed
information. Where tables match rows of data strings with columns of data types,
a data cube cross-references tables from single or multiple data sources to increase
the detail associated with each data point. This transformation connects the data to a
position in rows and columns of more than one table. The benefit is that knowledge
workers can use data cubes to create data volumes to drill down into and discover the
deepest insights possible.
Partitioning Method
Hierarchical Method
Density-based Method
Grid-Based Method
Model-Based Method
Constraint-based Method
Data mining is used for market basket analysis to provide information on what
product combinations were purchased together when they were bought and in
what sequence. This information helps businesses promote their most
profitable products and maximize profit. In addition, it encourages customers
to purchase related products that they may have missed or overlooked.
Retail companies use data mining to identify customers' buying behavior and
patterns.
Several data mining techniques e.g., distributed data mining have been
researched, modeled and developed to help credit card fraud detection.
Data mining is used to identify customers' loyalty by analyzing the data of
customers' purchasing activities, such as the frequency of purchases in a period
of time, the total monetary value of all purchases and when the last purchase
was made. After analyzing those dimensions, a relative measure is generated for
each customer. The higher the score, the more loyal the customer is.
To help the bank to retain credit card customers, data mining is applied. By
analyzing past data, data mining can help banks predict customers that are likely
to change their credit card affiliation, so they can plan and launch special offers
to retain those customers.
Credit card spending by customer groups can be identified by using data
mining.
The hidden correlations between different financial indicators can be
discovered by using data mining.
From historical market data, data mining makes it possible to identify stock trading rules.
The growth of the insurance industry entirely depends on the ability to convert data into
the knowledge, information or intelligence about customers, competitors, and its
markets. Data mining has been applied in the insurance industry only recently, but it has
brought tremendous competitive advantages to the companies that have implemented it
successfully. The data mining applications in the insurance industry are listed below:
Data mining helps determine the distribution schedules among warehouses and
outlets and analyze loading patterns.
Characteristics of SRS:
Correct
Complete and Unambiguous
Verifiable
Consistent
Traceable
Modifiable
Software Life Cycle Models:
A software life cycle model (also called process model) is a descriptive and diagrammatic
representation of the software life cycle. A life cycle model represents all the activities
required to make a software product transit through its life cycle phases. It also
captures the order in which these activities are to be undertaken. In other words, a
life cycle model maps the different activities performed on a software product from its
inception to retirement. Different life cycle models may map the basic development
activities to phases in different ways. Thus, no
matter which life cycle model is followed, the basic activities are included in all
life cycle models though the activities may be carried out in different orders in
different life cycle models. During any life cycle phase, more than one activity may also
be carried out. A software life cycle model is a particular abstraction representing a
software life cycle.Such a model may be:
Waterfall Model:
The Waterfall Model was the first Process Model to be introduced.
The Waterfall Model is a linear sequential flow. In which progress is seen as flowing
steadily downwards (like a waterfall) through the phases of software implementation.
This means that any phase in the development process begins only if the previous
phase is complete. The waterfall approach does not define the process to go back to
the previous phase to handle changes in requirement. The waterfall approach is the
earliest approach that was used for software development.
This model is used only when the requirements are very well known, clear and
fixed.
Product definition is stable.
Technology is understood.
There are no ambiguous requirements
Ample resources with required expertise are available freely
The project is short.
Very little customer interaction is involved during the development of the product.
Only once the product is ready can it be demoed to the end users. Once the product is
developed, if any failure occurs then the cost of fixing such issues is very high, because
we need to update everything from the documents to the logic.
Business Modeling
Data Modeling
Process Modeling
Application Generation
Testing and Turnover
1)Business Modeling: The business model for the product under development is designed
in terms of flow of information and the distribution of information between various business
channels. A complete business analysis is performed to find the vital information for
business, how it can be obtained, how and when is the information processed and what
are the factors driving successful flow of information.
2)Data Modeling: Once the business modeling phase is over and the business analysis is
complete, all the data required by the business analysis is identified in the data modeling
phase.
3)Process modeling: Data objects defined in data modeling are converted to achieve the
business information flow needed to achieve specific business objectives. Descriptions are
identified and created for CRUD (create, read, update and delete) operations on the data objects.
4)Application Generation: The actual system is built and coding is done by using
automation tools to convert process and data models into actual prototypes.
5)Testing and turnover: All the testing activities are performed to test the developed
application.
Iterative Model: This model leads the software development process in iterations. It
projects the process of development in cyclic manner repeating every step after every
cycle of SDLC process. The software is first developed on very small scale and all the
steps are followed which are taken into consideration. Then, on every next iteration,
more features and modules are designed, coded, tested, and added to the software.
Every cycle produces a software, which is complete in itself and has more features and
capabilities than that of the previous one. After each iteration, the management team
can do work on risk management and prepare for the next iteration. Because a cycle
includes small portion of whole software process, it is easier to manage the
development process but it consumes more resources.
In the iterative model we create only a high-level design of the application before we
actually begin to build the product, rather than defining the design solution for the entire
product. Later on we design and build a skeleton version of it, and then evolve the
design based on what has been built.
In iterative model we are building and improving the product step by step. Hence we
can track the defects at early stages. This avoids the downward flow of the defects.
In iterative model we can get the reliable user feedback. When presenting sketches
and blueprints of the product to users for their feedback, we are effectively asking them
to imagine how the product will work.
In iterative model less time is spent on documenting and more time is given for
designing.
V Model or Verification and Validation Model. Every testing execution should follow
some sequence and V Model is the perfect way to perform the testing approaches. In
V Model there are some steps or sequences specified which should be followed during
performing test approach. Once one step completes we should move to the next step.
Test execution sequences are followed in V shape. In software development life cycle,
V Model testing should start at the beginning of the project when requirement analysis
starts. In V Model project development and testing should go parallel. Verification
phase should be carried out from SDLC where validation phase should be carried out
from STLC (Software Testing Life Cycle)
Steps in V Model
Basically there are 4 steps involved in STLC while performing V Model testing strategy.
Unit Testing.
Integration Testing.
System Testing.
Acceptance Testing.
Advantages of V Model
If project is small and easy to understand, V Model is the best approach as its
easy and simple to use.
Many testing activities are performed in the beginning like planning and design
which saves lots of testing time.
Most of the defects and bugs are found in the beginning of the project
development, so there is less chance of a defect or bug occurring in the final
testing phase.
Disadvantages of V Model
Guessing the error in the beginning of the project could take more time.
Less flexibility.
Any unplanned change made in the middle of development can make it difficult
to update all the affected places, such as the test documents and requirements.
In case of some software deliverables, especially the large ones, it is difficult to assess
the effort required at the beginning of the software development life cycle.
There is lack of emphasis on necessary designing and documentation.
The project can easily get taken off track if the customer representative is not clear
about the final outcome they want.
Only senior programmers are capable of taking the kind of decisions required during
the development process. Hence it has no place for newbie programmers, unless
combined with experienced resources.
When to use Agile model:
When new changes are needed to be implemented. The freedom agile gives to
change is very important. New changes can be implemented at very little cost because
of the frequency of new increments that are produced.
To implement a new feature the developers need to lose only the work of a few days,
or even only hours, to roll back and implement it.
Unlike the waterfall model in agile model very limited planning is required to get started
with the project. Agile assumes that end users' needs are ever changing in a
dynamic business and IT world. Changes can be discussed and features can be newly
added or removed based on feedback. This effectively gives the customer the
finished system they want or need.
Both system developers and stakeholders alike, find they also get more freedom of
time and options than if the software was developed in a more rigid sequential way.
Having options gives them the ability to leave important decisions until more or better
data or even entire hosting programs are available; meaning the project can continue
to move forward without fear of reaching a sudden standstill.
Generates working software quickly and early during the software life cycle.
More flexible; less costly to change scope and requirements.
Easier to test and debug during a smaller iteration.
Easier to manage risk because risky pieces are identified and handled during its
iteration.
Each iteration is an easily managed milestone.
Disadvantages of Incremental Model
Such models are used where requirements are clear and can be implemented phase
wise. The requirements are divided into increments R1, R2, ..., Rn and delivered
accordingly.
Mostly such model is used in web applications and product based companies.
The Prototyping Model is applied when detailed information related to input and
output requirements of the system is not available. In this model, it is assumed that all
the requirements may not be known at the start of the development of the system. It is
usually used when a system does not exist or in case of a large and complex system
where there is no manual process to determine the requirements. This model allows
the users to interact and experiment with a working model of the system known
as prototype. The prototype gives the user an actual feel of the system.
Prototype model should be used when the desired system needs to have a lot of
interaction with the end users.
Typically, online systems, web interfaces have a very high amount of interaction with
end users, are best suited for Prototype model. It might take a while for a system to be
built that allows ease of use and needs minimal training for the end user.
Prototyping ensures that the end users constantly work with the system and provide a
feedback which is incorporated in the prototype to result in a useable system. They are
excellent for designing good human computer interface systems.
Big Bang Model This model is the simplest model in its form. It requires little planning,
lots of programming and lots of funds. This model is conceptualized around the big
bang of universe. As scientists say that after big bang lots of galaxies, planets, and
stars evolved just as an event. Likewise, if we put together lots of programming and
funds, you may achieve the best software product. This model is not suitable for large
software projects but good one for learning and experimenting.
COCOMO Model: The Constructive Cost Model was developed by Barry Boehm; it is a
software cost estimation model. It works by combining a regression formula with
predetermined parameters that are derived from the data of past projects. Its main
advantages are: you can determine the costs that will be incurred when investing in a
particular project; the estimates and related information obtained are factual, so the
results are accurate; the structure of the model can be customized to your convenience;
and it can be repeated any number of times, which means you can calculate the cost of a
project initially and then determine how changes and modifications will affect the initial
estimates. Ease of use is what has made this model popular; it allows users to be in full
control of their projects and all the costs entailed. It is also well documented and
calibrated, offering precise calculations. The basic effort and schedule equations are
sketched below.
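For reference, basic COCOMO estimates effort in person-months from size in KLOC as Effort = a*(KLOC)^b and development time as Time = c*(Effort)^d. The sketch below uses the commonly quoted coefficients for the organic project mode (a=2.4, b=1.05, c=2.5, d=0.38); treat the numbers as illustrative.

def basic_cocomo_organic(kloc: float) -> tuple:
    # Basic COCOMO, organic-mode coefficients (commonly quoted values)
    a, b, c, d = 2.4, 1.05, 2.5, 0.38
    effort = a * (kloc ** b)          # effort in person-months
    time = c * (effort ** d)          # development time in months
    staff = effort / time             # average number of people needed
    return effort, time, staff

effort, time, staff = basic_cocomo_organic(32)   # e.g. a hypothetical 32 KLOC project
print(f"Effort: {effort:.1f} person-months, "
      f"Time: {time:.1f} months, Avg staff: {staff:.1f}")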
Gantt Chart: A Gantt chart is a horizontal bar chart developed as a production control
tool in 1917 by Henry L. Gantt, an American engineer and social scientist. Frequently
used in project management, a Gantt chart provides a graphical illustration of a
schedule that helps to plan, coordinate, and track specific tasks in a project. Gantt
charts may be simple versions created on graph paper or more complex automated
versions created using project management applications such as Microsoft Project or
Excel. They can also be used for scheduling production processes and employee
rostering. In the latter context, they may also be known as timebar schedules. Gantt
charts can be used to track shifts or tasks and also vacations or other types of out-of-
office time. Specialized employee scheduling software may output schedules as a
Gantt chart, or they may be created through popular desktop publishing software.
VERIFICATION
Verification is the process to make sure the product satisfies the conditions imposed at
the start of the development phase. In other words, to make sure the product behaves
the way we want it to.
VALIDATION
Validation is the process to make sure the product satisfies the specified requirements
at the end of the development phase. In other words, to make sure the product is built
as per customer requirements.
Functionality testing
Implementation testing
When functionality is being tested without taking the actual implementation in concern
it is known as black-box testing. The other side is known as white-box testing where
not only functionality is tested but the way it is implemented is also analyzed.
There are many types of Black Box Testing, but the following are the prominent ones.
Black box testing has its own life cycle called Software Test Life Cycle (STLC)
and it is relative to every stage of Software Development Life Cycle.Some
famous Black Box testing techniques are Boundary value analysis, state
transition testing, equivalence partitioning.
2)White Box Testing:It is also known as Clear Box Testing, Open Box Testing, Glass
Box Testing, Transparent Box Testing, Code-Based Testing or Structural Testing.
It is a software testing method in which the internal structure/ design/ implementation
of the item being tested is known to the tester. The tester chooses inputs to exercise
paths through the code and determines the appropriate outputs. Programming know-
how and the implementation knowledge is essential. White box testing is testing
beyond the user interface and into the nitty-gritty of a system.
This method is named so because the software program, in the eyes of the tester, is
like a white/ transparent box; inside which one clearly sees.
White box testing, on its own, cannot identify problems caused by mismatches
between the actual requirements or specification and the code as implemented but it
can help identify some types of design weaknesses in the code. Examples include
control flow problems (e.g., closed or infinite loops or unreachable code), and data flow
problems (e.g., trying to use a variable which has no defined value). Static code
analysis (by a tool) may also find these sorts of problems, but doesn't help the
tester/developer understand the code to the same degree that personally designing
white-box test cases does.
Gray Box Testing is named so because the software program, in the eyes of the tester
is like a gray/ semi-transparent box; inside which one can partially see.
Gray Box Testing gives the ability to test both sides of an application, presentation
layer as well as the code part. It is primarily useful in Integration Testing and
Penetration Testing. Grey-box testing is a perfect fit for Web-based applications. Grey-box
testing is also a good approach for functional or domain testing.
Matrix Testing: This testing technique involves defining all the variables that
exist in their programs.
Regression Testing: To check whether the change in the previous version has
regressed other aspects of the program in the new version. It will be done by
testing strategies like retest all, retest risky use cases, retest within firewall.
Orthogonal Array Testing or OAT: It provides maximum code coverage with
minimum test cases.
Pattern Testing: This testing is performed on the historical data of the previous
system defects. Unlike black box testing, gray box testing digs within the code
and determines why the failure happened
Usually, the grey box methodology uses automated software testing tools to conduct the
testing. Stubs and module drivers are created to relieve the tester from having to
manually generate code.
There are many other types of testing like:
Acceptance Testing
Acceptance testing is often done by the customer to ensure that the delivered product
meets the requirements and works as the customer expected. It falls under the class of
black box testing.
Regression Testing
Regression testing is the testing after modification of a system, component, or a group
of related units to ensure that the modification is working correctly and is not damaging
or imposing other modules to produce unexpected results. It falls under the class of
black box testing.
Beta Testing
Beta testing is the testing which is done by end users, a team outside development, or
publicly releasing full pre-version of the product which is known as beta version. The
aim of beta testing is to cover unexpected errors. It falls under the class of black box
testing.
Unit Testing
Unit testing is the testing of an individual unit or group of related units. It falls under the
class of white box testing. It is often done by the programmer to test that the unit
he/she has implemented is producing expected output against given input.Statements,
functions, methods, interfaces i.e units of the code are individually tested for proper
execution. It can be automated or can be done manually. Usually small data is used
for unit testing.
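As a small illustration of automated unit testing, the sketch below uses Python's built-in unittest module to test a hypothetical add() function against expected outputs; the function and values are made up for the example.

import unittest

def add(a, b):
    # unit under test: a deliberately simple, hypothetical function
    return a + b

class TestAdd(unittest.TestCase):
    def test_positive_numbers(self):
        self.assertEqual(add(2, 3), 5)      # expected output for the given input

    def test_negative_numbers(self):
        self.assertEqual(add(-2, -3), -5)   # a negative scenario is also covered

if __name__ == "__main__":
    unittest.main()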
Integration Testing
Integration testing is testing in which a group of components are combined to produce
output. Also, the interaction between software and hardware is tested in integration
testing if software and hardware components have any relation. It may fall under both
white box testing and black box testing. Different approaches used in integration
testing are: top down & bottom up integration testing, sandwich
testing (combination of both).
Stress Testing
Stress testing is the testing to evaluate how system behaves under unfavorable
conditions. Testing is conducted at beyond limits of the specifications. It falls under the
class of black box testing.
Performance Testing
Performance testing is the testing to assess the speed and effectiveness of the system
and to make sure it is generating results within a specified time as in performance
requirements. It falls under the class of black box testing.
Functional Testing
Functional testing is the testing to ensure that the specified functionality required in the
system requirements works. It falls under the class of black box testing.
System Testing
System testing is the testing to ensure that by putting the software in different
environments (e.g., Operating Systems) it still works. System testing is done with full
system implementation and environment. It falls under the class of black box testing. It
is performed after integration testing. Various approaches used are: load testing,
smoke testing, security testing, migration testing etc.
Usability Testing
Usability testing is performed from the perspective of the client, to evaluate how
user-friendly the GUI is. How easily can the client learn it? After learning how to use it,
how proficiently can the client perform? How pleasing is the design to use? This falls
under the class of black box testing.
Data Flow Diagram: Data Flow Diagram (DFD) is a graphical representation of flow
of data in an information system. It is capable of depicting incoming data flow, outgoing
data flow, and stored data. The DFD does not mention anything about how data flows
through the system. There is a prominent difference between DFD and Flowchart. The
flowchart depicts flow of control in program modules. DFDs depict flow of data in the
system at various levels. It does not contain any control or branch elements.
Logical DFD - This type of DFD concentrates on the system process, and flow of
data in the system. For example in a banking software system, how data is moved
between different entities.
Physical DFD - This type of DFD shows how the data flow is actually implemented in
the system. It is more specific and close to the implementation.
DFD Components
DFD can represent Source, destination, storage and flow of data using the following
set of components -
Entities - Entities are the source and destination of information data. Entities are
represented by rectangles with their respective names.
Process - Activities and action taken on the data are represented by Circle or
Round-edged rectangles.
Data Storage - There are two variants of data storage - it can either be
represented as a rectangle with absence of both smaller sides or as an open-
sided rectangle with only one side missing.
Data Flow - Movement of data is shown by pointed arrows. Data movement is
shown from the base of arrow as its source towards head of the arrow as
destination.
Levels of DFD:
DFD Level 0 is also called a Context Diagram. It's a basic overview of the whole
system or process being analyzed or modeled. It's designed to be an at-a-glance view,
showing the system as a single high-level process, with its relationship to external
entities. It should be easily understood by a wide audience, including stakeholders,
business analysts, data analysts and developers.
DFD Level 1 provides a more detailed breakout of pieces of the Context Level
Diagram. You will highlight the main functions carried out by the system, as you break
down the high-level process of the Context Diagram into its subprocesses.
DFD Level 2 then goes one step deeper into parts of Level 1. It may require more text
to reach the necessary level of detail about the system's functioning.
Structure chart is a chart derived from Data Flow Diagram. It represents the system
in more detail than DFD. It breaks down the entire system into lowest functional
modules, describes functions and sub-functions of each module of the system to a
greater detail than DFD. Structure chart represents hierarchical structure of modules.
At each layer a specific task is performed.
Decision Table Testing is a good way to deal with combinations of inputs which
produce different results. It helps reduce the test effort in verifying each and every
combination of test data, while at the same time ensuring complete coverage. A small
sketch follows.
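A minimal sketch of the idea, assuming a hypothetical login rule with two conditions (valid user, valid password): each row of the table is one combination of inputs with its expected outcome, and the test loops over every combination.

# Decision table for a hypothetical login check:
# (valid_user, valid_password) -> expected outcome
decision_table = [
    (True,  True,  "login succeeds"),
    (True,  False, "login fails"),
    (False, True,  "login fails"),
    (False, False, "login fails"),
]

def login(valid_user: bool, valid_password: bool) -> str:
    # system under test (hypothetical rule)
    return "login succeeds" if valid_user and valid_password else "login fails"

# Execute one test per row, covering every combination exactly once
for valid_user, valid_password, expected in decision_table:
    assert login(valid_user, valid_password) == expected
print("all combinations covered")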
Active Data Dictionary: Any changes to the database object structure via DDLs will
have to be reflected in the data dictionary. But updating the data dictionary tables for
these changes is the responsibility of the database in which the data dictionary exists. If the
data dictionary is created in the same database, then the DBMS software will
automatically update the data dictionary. Hence there will not be any mismatch
between the actual structure and the data dictionary details. Such data dictionary is
called active data dictionary.
User can change the structure of database objects by using DDLs. But users can not
change the structure/content of data dictionary tables/views. All the data dictionary
tables/views are controlled and managed by DBMS. Users do not have any
modification rights on them.
UML stands for Unified Modeling Language. UML 2.0 helped extend the original
UML specification to cover a wider portion of software development efforts including
agile practices.
Improved integration between structural models like class diagrams and behavior
models like activity diagrams.
Added the ability to define a hierarchy and decompose a software system into
components and sub-components.
The original UML specified nine diagrams; UML 2.x brings that number up to 13.
The four new diagrams are called: communication diagram, composite structure
diagram, interaction overview diagram, and timing diagram. It also renamed
statechart diagrams to state machine diagrams, also known as state diagrams.
Types of UML:
Structural UML diagrams
Class diagram
Class diagrams are the backbone of almost every object-oriented method,
including UML. They describe the static structure of a system.
Package diagram
Package diagrams are a subset of class diagrams, but developers sometimes
treat them as a separate technique. Package diagrams organize elements of a
system into related groups to minimize dependencies between packages.
Object diagram
Object diagrams describe the static structure of a system at a particular time.
They can be used to test class diagrams for accuracy.
Component diagram
Component diagrams describe the organization of physical software
components, including source code, run-time (binary) code, and executables.
Composite structure diagram
Composite structure diagrams show the internal part of a class.
Deployment diagram
Deployment diagrams depict the physical resources in a system, including
nodes, components, and connections.
Behavioral UML diagrams
Activity diagram
Activity diagrams illustrate the dynamic nature of a system by modeling the flow
of control from activity to activity. An activity represents an operation on some
class in the system that results in a change in the state of the system. Typically,
activity diagrams are used to model workflow or business processes and internal
operation.
Sequence diagram
Sequence diagrams describe interactions among classes in terms of an
exchange of messages over time.
Use case diagram
Use case diagrams model the functionality of a system using actors and use
cases.
State diagram
Statechart diagrams, now known as state machine diagrams or state diagrams,
describe the dynamic behavior of a system in response to external stimuli. State
diagrams are especially useful in modeling reactive objects whose states are
triggered by specific events.
Communication diagram
Communication diagrams model the interactions between objects in sequence.
They describe both the static structure and the dynamic behavior of a system.
Interaction overview diagram
Interaction overview diagrams are a combination of activity and sequence
diagrams. They model a sequence of actions and let you deconstruct more
complex interactions into manageable occurrences.
Timing diagram
A timing diagram is a type of behavioral or interaction UML diagram that focuses
on processes that take place during a specific period of time. They're a special
instance of a sequence diagram, except time is shown to increase from left to
right instead of top down.
Software Quality:
Quality software is reasonably bug or defect free, delivered on time and within budget,
meets requirements and/or expectations, and is maintainable.
ISO 8402-1986 standard defines quality as the totality of features and characteristics
of a product or service that bear on its ability to satisfy stated or implied needs.
Once the processes have been defined and implemented, Quality Assurance has the
following responsibilities:
The quality management system under which the software system is created is
normally based on one or more of the following models/standards:
CMMI
Six Sigma
ISO 9000
Note: There are many other models/standards for quality management but the ones
mentioned above are the most popular.
Software Quality Assurance encompasses the entire software development life cycle
and the goal is to ensure that the development and/or maintenance processes are
continuously improved to produce products that meet specifications/requirements. The
process of Software Quality Control (SQC) is also governed by Software Quality
Assurance (SQA).SQA is generally shortened to just QA.
Software Quality Control (SQC) is a set of activities for ensuring quality
in software products.
It includes the following activities:
Reviews
o Requirement Review
o Design Review
o Code Review
o Deployment Plan Review
o Test Plan Review
o Test Cases Review
Testing
o Unit Testing
o Integration Testing
o System Testing
o Acceptance Testing
Test Case:
A test case is a document, which has a set of test data, preconditions, expected
results and post conditions, developed for a particular test scenario in order to verify
compliance against a specific requirement. Test Case acts as the starting point for the
test execution, and after applying a set of input values, the application has a definitive
outcome and leaves the system at some end point or also known as execution post
condition.
Test Scenario
Test Steps
Prerequisite
Test Data
Expected Result
Test Parameters
Actual Result
Environment Information
Comments
As far as possible, write test cases in such a way that you test only one thing at
a time. Do not overlap or complicate test cases. Attempt to make your test
cases atomic.
Ensure that all positive scenarios and negative scenarios are covered.
Language:
o Write in simple and easy to understand language.
o Use active voice: Do this, do that.
o Use exact and consistent names (of forms, fields, etc).
Characteristics of a good test case:
o Accurate: Exacts the purpose.
o Economical: No unnecessary steps or words.
o Traceable: Capable of being traced to requirements.
o Repeatable: Can be used to perform the test over and over.
o Reusable: Can be reused if necessary.
Type of Beta:
Developers release either a closed beta or an open beta; closed beta versions are
released to a select group of individuals for a user test and are invitation only, while
open betas are available to a larger group, the general public, and anyone interested.
The testers report any bugs that they find, and sometimes suggest additional features
they think should be available in the final version.
Gamma Check:
Gamma check is performed when the application is ready for release to the specified
requirements; this check is performed directly, without going through all the in-house
testing activities.
Elements of TQM:
Root Cause Analysis
Customer-focused
Process-oriented
Continuous improvement
Effective Communication
Quality Control Tools:
Cause - Effect Diagram
Checklists
Histogram
Graphs
Pareto Charts
Tree Diagram
Arrow Diagram
A data structure is said to be linear if its elements form a sequence in a specific order.
There are basically two techniques of representing such linear structure within
memory.
First way is to provide the linear relationships among all the elements
represented by means of linear memory location. These linear structures are
termed as arrays.
The second technique is to provide the linear relationship among all the
elements represented by using the concept of pointers or links. These linear
structures are termed as linked lists.
Arrays
Queue
Stacks
Linked List
This structure is mostly used for representing data that contains a hierarchical
relationship among various elements.
Tree
Graph
1. Input Step
2. Assignment Step
3. Decision Step
4. Repetitive Step
5. Output Step
1. To save time (Time Complexity): A program that runs faster is a better program.
2. To save space (Space Complexity): A program that saves space over a competing
program is considered desirable.
Following are the commonly used asymptotic notations to calculate the running
time complexity of an algorithm.
Big Oh Notation (O)
Omega Notation (Ω)
Theta Notation (Θ)
1)Big Oh Notation, O: The notation O(n) is the formal way to express the upper bound of
an algorithm's running time. It measures the worst case time complexity, or the longest
amount of time an algorithm can possibly take to complete.
2)Omega Notation, Ω: The notation Ω(n) is the formal way to express the lower bound of
an algorithm's running time. It measures the best case time complexity, or the minimum
amount of time an algorithm can possibly take to complete.
3)Theta Notation, Θ: The notation Θ(n) is the formal way to express both the lower bound
and the upper bound of an algorithm's running time.
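Formally, these notations can be stated as follows (standard definitions, given here in LaTeX for reference):

f(n) = O(g(n)) \iff \exists\, c > 0,\ n_0 > 0 \ \text{such that}\ 0 \le f(n) \le c\,g(n) \ \text{for all } n \ge n_0
f(n) = \Omega(g(n)) \iff \exists\, c > 0,\ n_0 > 0 \ \text{such that}\ 0 \le c\,g(n) \le f(n) \ \text{for all } n \ge n_0
f(n) = \Theta(g(n)) \iff f(n) = O(g(n)) \ \text{and}\ f(n) = \Omega(g(n))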
Linked List: A linked list is a linear collection of data elements, called nodes, where
the linear order is given by means of pointers. Each node is divided into two parts:
1. Singly Linked List/Linear Linked List : It is also called One Way List or Singly
Linked List. It is linear collection of data elements which are called Nodes. The
elements may or may not be stored in consecutive memory locations. So pointers are
used to maintain the linear order. Each node is divided into two parts. The first part contains
the information of the element and is called INFO Field. The second part contains the
address of the next node and is called LINK Field or NEXT Pointer Field. The
START contains the starting address of the linked list i.e. it contains the address of the
first node of the linked list. The LINK Field of last node contains NULL Value which
indicates that it is the end of linked list. The operations we can perform on singly linked
lists are insertion, deletion and traversal.
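A minimal Python sketch of a singly linked list (class and method names are illustrative), showing the INFO and NEXT fields, insertion at the end, and traversal:

class Node:
    def __init__(self, info):
        self.info = info      # INFO field: the element's data
        self.next = None      # LINK/NEXT field: reference to the next node (None = end)

class SinglyLinkedList:
    def __init__(self):
        self.start = None     # START: reference to the first node

    def insert_at_end(self, info):
        node = Node(info)
        if self.start is None:           # empty list: new node becomes the first node
            self.start = node
            return
        current = self.start
        while current.next is not None:  # walk to the last node
            current = current.next
        current.next = node

    def traverse(self):
        current = self.start
        while current is not None:       # follow NEXT pointers until NULL
            print(current.info)
            current = current.next

lst = SinglyLinkedList()
for value in [10, 20, 30]:
    lst.insert_at_end(value)
lst.traverse()                           # prints 10, 20, 30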
2. Doubly Linked List : In this type of Linked list, there are two references associated
with each node, One of the reference points to the next node and one to the previous
node. The advantage of this data structure is that we can traverse in both directions,
and for deletion we don't need to have explicit access to the previous node.
3. Circular Linked List : Circular linked list is a linked list where all nodes are
connected to form a circle. There is no NULL at the end. A circular linked list can be a
singly circular linked list or doubly circular linked list. Advantage of this data structure is
that any node can be made as starting node. This is useful in implementation of
circular queue in linked list. Circular Doubly Linked Lists are used for implementation of
advanced data structures like Fibonacci Heap.
They are dynamic in nature, allocating memory when required.
Insertion and deletion operations can be easily implemented.
Stacks and queues can be easily implemented using linked lists.
Linked lists reduce the access time.
Stack:
At all times, we maintain a pointer to the last PUSHed data on the stack. As this
pointer always represents the top of the stack, hence named top.
The top pointer provides top value of the stack without actually removing it.
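A minimal sketch using a Python list as a stack, with the top implied by the end of the list (push, pop and peek are the usual operation names):

stack = []

stack.append(10)        # PUSH: 10 becomes the top
stack.append(20)        # PUSH: 20 is now the top

print(stack[-1])        # PEEK/top: returns 20 without removing it
print(stack.pop())      # POP: removes and returns 20 (LIFO order)
print(stack.pop())      # POP: removes and returns 10
print(len(stack) == 0)  # the stack is now empty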
2)Queue:
Enqueue
This operation is used to add an item to the queue at the rear (tail) end. The rear count is
incremented by one after the addition of each item, until the queue becomes full. This
operation is performed at the rear end of the queue.
Dequeue
This operation is used to remove an item from the queue at the front (head) end. The front
count is incremented by one each time an item is removed from the queue, until the queue
becomes empty. This operation is performed at the front end of the queue.
Initialize
This operation is used to initialize the queue by representing the head and tail
positions in the memory allocation table (MAT).
Few more functions are required to make the above-mentioned queue operation
efficient. These are
peek() - Gets the element at the front of the queue without removing it.
isfull() - Checks if the queue is full.
isempty() - Checks if the queue is empty.
In a queue, we always dequeue (or access) data pointed to by the front pointer, and while
enqueuing (or storing) data in the queue we take the help of the rear pointer.
Insert: O(1)
Remove: O(1)
Size: O(1)
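A minimal sketch using collections.deque from Python's standard library, which gives O(1) enqueue and dequeue:

from collections import deque

queue = deque()

queue.append("A")      # enqueue at the rear
queue.append("B")
queue.append("C")

print(queue[0])        # peek: front element ("A") without removing it
print(queue.popleft()) # dequeue from the front: "A" (FIFO order)
print(queue.popleft()) # "B"
print(len(queue))      # size: 1 element ("C") remains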
Circular Queue: In a standard queue data structure, a re-buffering problem occurs for
each dequeue operation. This problem is solved by joining the front and rear ends of the
queue to make it a circular queue. A circular queue is a linear data structure and follows
the FIFO principle.
In a circular queue the last node is connected back to the first node to make a circle.
A circular queue follows the First In First Out principle.
Elements are added at the rear end and deleted at the front end of the queue.
Initially, both the front and the rear pointers point to the beginning of the array.
It is also called a Ring Buffer.
Items can be inserted into and deleted from the queue in O(1) time.
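A minimal fixed-size circular queue sketch over a Python list, with the front index and element count wrapping around using the modulo operator (names are illustrative):

class CircularQueue:
    def __init__(self, capacity):
        self.buffer = [None] * capacity   # fixed-size ring buffer
        self.capacity = capacity
        self.front = 0                    # index of the front element
        self.count = 0                    # number of stored elements

    def is_full(self):
        return self.count == self.capacity

    def is_empty(self):
        return self.count == 0

    def enqueue(self, item):
        if self.is_full():
            raise OverflowError("queue is full")
        rear = (self.front + self.count) % self.capacity  # wrap around the end
        self.buffer[rear] = item
        self.count += 1

    def dequeue(self):
        if self.is_empty():
            raise IndexError("queue is empty")
        item = self.buffer[self.front]
        self.front = (self.front + 1) % self.capacity     # wrap around the end
        self.count -= 1
        return item

q = CircularQueue(3)
for x in [1, 2, 3]:
    q.enqueue(x)
print(q.dequeue(), q.dequeue())  # 1 2
q.enqueue(4)                     # reuses the freed slot at the start of the array
print(q.dequeue(), q.dequeue())  # 3 4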
Sorting
Sorting is nothing but storage of data in sorted order, it can be in ascending or
descending order. The term Sorting comes into picture with the term Searching. There
are so many things in our real life that we need to search, like a particular record in
database, roll numbers in merit list, a particular telephone number, any particular page
in a book etc. Sorting arranges data in a sequence which makes searching easier.
Every record which is going to be sorted will contain one key. Based on the key the
record will be sorted.
1. Bubble Sort
2. Insertion Sort
3. Selection Sort
4. Quick Sort
5. Merge Sort
6. Heap Sort
Bubble Sort
Bubble Sort is probably one of the oldest, easiest, most straightforward, and most inefficient
sorting algorithms. It works by comparing each element of the list with the element next
to it and swapping them if required. With each pass, the largest value of the list is "bubbled"
to the end of the list whereas the smaller values sink towards the beginning. This way the
number of passes needed is equal to the size of the array minus 1.
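A short sketch of bubble sort in Python (ascending order), with an early exit when a pass makes no swaps:

def bubble_sort(arr):
    n = len(arr)
    for i in range(n - 1):               # at most n-1 passes
        swapped = False
        for j in range(n - 1 - i):       # the last i elements are already in place
            if arr[j] > arr[j + 1]:      # compare neighbours and swap if out of order
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:                  # no swaps means the list is already sorted
            break
    return arr

print(bubble_sort([5, 1, 4, 2, 8]))      # [1, 2, 4, 5, 8]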
Selection Sort
The idea of selection sort is rather simple: we repeatedly find the next largest (or
smallest) element in the array and move it to its final position in the sorted array.
Assume that we wish to sort the array in increasing order, i.e. the smallest element at
the beginning of the array and the largest element at the end. We begin by selecting
the largest element and moving it to the highest index position. We can do this by
swapping the element at the highest index and the largest element. We then reduce
the effective size of the array by one element and repeat the process on the smaller
(sub)array. The process stops when the effective size of the array becomes 1 (an array
of 1 element is already sorted).
Insertion Sort
The Insertion Sort algorithm is a commonly used algorithm. Even if you haven't been a
programmer or a student of computer science, you may have used this algorithm. Try
recalling how you sort a deck of cards. You start from the beginning, traverse through
the cards and as you find cards misplaced by precedence you remove them and insert
them back into the right position. Eventually what you have is a sorted deck of cards.
The same idea is applied in the Insertion Sort algorithm.
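A minimal insertion sort sketch in C (illustrative only; the sample values are made up):

#include <stdio.h>

/* Insertion sort: like sorting playing cards, each new element is taken
 * out and inserted into its correct position among the already-sorted
 * elements to its left. */
void insertion_sort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > key) {   /* shift larger elements right */
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;                  /* insert the key in its place */
    }
}

int main(void) {
    int a[] = {12, 11, 13, 5, 6};
    int n = sizeof a / sizeof a[0];
    insertion_sort(a, n);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}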
ShellSort
ShellSort is mainly a variation of Insertion Sort. In insertion sort, we move elements
only one position ahead. When an element has to be moved far ahead, many
movements are involved. The idea of shellSort is to allow exchange of far items. In
shellSort, we make the array h-sorted for a large value of h. We keep reducing the
value of h until it becomes 1. An array is said to be h-sorted if all sublists of every hth
element are sorted.
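A sketch of ShellSort in C, using the simple gap sequence n/2, n/4, ..., 1 (the gap sequence and sample data are assumptions made for the example, not prescribed by the notes):

#include <stdio.h>

/* Shell sort: perform gapped insertion sorts for a decreasing sequence of
 * gaps h, so far-apart elements can be exchanged early; the final pass
 * (h = 1) is a plain insertion sort on an almost-sorted array. */
void shell_sort(int a[], int n) {
    for (int h = n / 2; h > 0; h /= 2) {
        for (int i = h; i < n; i++) {
            int key = a[i];
            int j = i;
            while (j >= h && a[j - h] > key) {   /* h-sorted insertion */
                a[j] = a[j - h];
                j -= h;
            }
            a[j] = key;
        }
    }
}

int main(void) {
    int a[] = {23, 29, 15, 19, 31, 7, 9, 5, 2};
    int n = sizeof a / sizeof a[0];
    shell_sort(a, n);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}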
Heap Sort
Heap sort is a comparison based sorting technique based on Binary Heap data
structure. It is similar to selection sort where we first find the maximum element and
place it at the end. We repeat the same process for the remaining elements.
Merge Sort
MergeSort is a Divide and Conquer algorithm. It divides the input array into two halves,
calls itself for the two halves, and then merges the two sorted halves.
Quick sort
Like Merge Sort, QuickSort is a Divide and Conquer algorithm. It picks an element as
pivot and partitions the given array around the picked pivot. There are many different
versions of quickSort that pick pivot in different ways.
1) Always pick the first element as pivot.
2) Always pick the last element as pivot (used in the sketch below).
3) Pick a random element as pivot.
4) Pick the median as pivot.
The key process in quickSort is partition().
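Here is a sketch of quickSort in C with the last element chosen as the pivot (the partition scheme shown, sometimes called Lomuto partitioning, is one common choice; the sample data is illustrative):

#include <stdio.h>

static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* partition(): place the pivot (the last element here) into its final
 * position, with smaller elements on its left and larger on its right,
 * and return that position. */
int partition(int a[], int low, int high) {
    int pivot = a[high];
    int i = low - 1;                    /* boundary of the "smaller" region */
    for (int j = low; j < high; j++) {
        if (a[j] < pivot) swap(&a[++i], &a[j]);
    }
    swap(&a[i + 1], &a[high]);          /* put the pivot in the middle */
    return i + 1;
}

void quick_sort(int a[], int low, int high) {
    if (low < high) {
        int p = partition(a, low, high);
        quick_sort(a, low, p - 1);      /* sort the left part */
        quick_sort(a, p + 1, high);     /* sort the right part */
    }
}

int main(void) {
    int a[] = {10, 80, 30, 90, 40, 50, 70};
    int n = sizeof a / sizeof a[0];
    quick_sort(a, 0, n - 1);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}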
A Linear Search is the most basic and simple search algorithm. A linear search looks for
an element or value in an array by scanning it in sequential order until the desired element
is found. It compares the element with all the other elements in the list; if the element is
matched it returns its index, otherwise it returns -1.
Linear Search is applied to an unsorted or unordered list when there are fewer elements
in the list. In complexity terms this is an O(n) search: the time taken to search the list
grows at the same rate as the list does.
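A minimal linear search sketch in C (illustrative only; the sample values are made up):

#include <stdio.h>

/* Linear search: compare the key with every element in turn.
 * Returns the index of the first match, or -1 if the key is absent. O(n). */
int linear_search(const int a[], int n, int key) {
    for (int i = 0; i < n; i++)
        if (a[i] == key) return i;
    return -1;
}

int main(void) {
    int a[] = {7, 3, 9, 1, 5};
    printf("%d\n", linear_search(a, 5, 9));   /* prints 2 */
    printf("%d\n", linear_search(a, 5, 4));   /* prints -1 */
    return 0;
}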
Binary Search is applied on a sorted array or list. In binary search, we first compare
the value with the element in the middle position of the array. If the value is matched,
then we return the value. If the value is less than the middle element, then it must lie in
the lower half of the array and if it's greater than the element then it must lie in the
upper half of the array. We repeat this procedure on the lower (or upper) half of the
array. Binary Search is useful when there are large numbers of elements in an array.
In complexity terms this is an O(log n) search - the number of search operations
grows more slowly than the list does, because you're halving the "search space" with
each operation.
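An iterative binary search sketch in C (illustrative only; the sorted sample array is made up):

#include <stdio.h>

/* Binary search on a sorted array: compare with the middle element and
 * discard half of the remaining range at each step. O(log n). */
int binary_search(const int a[], int n, int key) {
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   /* avoids overflow of (low + high) */
        if (a[mid] == key) return mid;
        if (key < a[mid]) high = mid - 1;   /* key lies in the lower half */
        else              low = mid + 1;    /* key lies in the upper half */
    }
    return -1;                              /* not found */
}

int main(void) {
    int a[] = {2, 5, 8, 12, 16, 23, 38};
    printf("%d\n", binary_search(a, 7, 16));  /* prints 4 */
    printf("%d\n", binary_search(a, 7, 7));   /* prints -1 */
    return 0;
}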
Hashing is a technique that is used to uniquely identify a specific object from a group
of similar objects. Some examples of how hashing is used in our lives include:
In universities, each student is assigned a unique roll number that can be used
to retrieve information about them.
In libraries, each book is assigned a unique number that can be used to
determine information about the book, such as its exact position in the library or
the users it has been issued to etc.
In both these examples the students and books were hashed to a unique number.
Assume that you have an object and you want to assign a key to it to make searching
easy. To store the key/value pair, you can use a simple array like a data structure
where keys (integers) can be used directly as an index to store values. However, in
cases where the keys are large and cannot be used directly as an index, you should
use hashing.
In hashing, large keys are converted into small keys by using hash functions. The
values are then stored in a data structure called hash table. The idea of hashing is to
distribute entries (key/value pairs) uniformly across an array. Each element is assigned
a key (converted key). By using that key you can access the element in O(1) time.
Using the key, the algorithm (hash function) computes an index that suggests where
an entry can be found or inserted.
hash = hashfunc(key)
index = hash % array_size
In this method, the hash is independent of the array size, and it is then reduced to an
index (a number between 0 and array_size - 1) by using the modulo operator (%).
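The two lines above can be illustrated with a toy string hash in C; the hash function below (summing character codes) and the table size are assumptions made for the example, and real hash tables use better functions, but the reduction to an index with the modulo operator works the same way.

#include <stdio.h>

#define TABLE_SIZE 16

/* Toy hash function: sum of the character codes of the key. */
unsigned int hashfunc(const char *key) {
    unsigned int hash = 0;
    while (*key) hash += (unsigned char)*key++;
    return hash;
}

int main(void) {
    const char *keys[] = {"apple", "banana", "cherry"};
    for (int i = 0; i < 3; i++) {
        unsigned int hash  = hashfunc(keys[i]);
        unsigned int index = hash % TABLE_SIZE;   /* index in [0, TABLE_SIZE - 1] */
        printf("%-7s -> hash %u -> slot %u\n", keys[i], hash, index);
    }
    return 0;
}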
Hash function
A hash function is any function that can be used to map a data set of an arbitrary size
to a data set of a fixed size, which falls into the hash table. The values returned by a
hash function are called hash values, hash codes, hash sums, or simply hashes.
Applications
Associative arrays: Hash tables are commonly used to implement many types of
in-memory tables. They are used to implement associative arrays (arrays whose
indices are arbitrary strings or other complicated objects).
Database indexing: Hash tables may also be used as disk-based data
structures and database indices (such as in dbm).
Caches: Hash tables can be used to implement caches i.e. auxiliary data tables
that are used to speed up the access to data, which is primarily stored in slower
media.
Object representation: Several dynamic languages, such as Perl, Python,
JavaScript, and Ruby use hash tables to implement objects.
Hash Functions are used in various algorithms to make their computing faster
Greedy algorithms work by recursively constructing a set of objects from the smallest
possible constituent parts. Recursion is an approach to problem solving in which the
solution to a particular problem depends on solutions to smaller instances of the same
problem. The advantage to using a greedy algorithm is that solutions to smaller
instances of the problem can be straightforward and easy to understand. The
disadvantage is that it is entirely possible that the most optimal short-term solutions
may lead to the worst possible long-term outcome.
Greedy techniques are used, for example, in many networking algorithms and in the
minimum spanning tree algorithms discussed later. The following well-known algorithms,
by contrast, are based on the divide and conquer approach, which breaks a problem into
independent sub-problems and combines their solutions:
Merge Sort
Quick Sort
Binary Search
Strassen's Matrix Multiplication
Closest pair (points)
Dynamic programming approach is similar to divide and conquer in breaking
down the problem into smaller and yet smaller possible sub-problems. But unlike
divide and conquer, these sub-problems are not solved independently.
Rather, results of these smaller sub-problems are remembered and used for
similar or overlapping sub-problems. Dynamic programming is used where we
have problems, which can be divided into similar sub-problems, so that their
results can be re-used. Mostly, these algorithms are used for optimization.
Before solving the in-hand sub-problem, dynamic algorithm will try to examine
the results of the previously solved sub-problems. The solutions of sub-
problems are combined in order to achieve the best solution. Dynamic
programming can be used in both top-down and bottom-up manner.
Graph
Tree: A tree is an ideal data structure for representing hierarchical data. A tree can be
theoretically defined as a finite set of one or more data items (nodes).
Path Path refers to the sequence of nodes along the edges of a tree.
Root The node at the top of the tree is called root. There is only one root per
tree and one path from the root node to any node.
Parent Any node except the root node has one edge upward to a node called
parent.
Child The node below a given node connected by its edge downward is
called its child node.
Leaf The node which does not have any child node is called the leaf node.
Visiting Visiting refers to checking the value of a node when control is on the
node.
Levels Level of a node represents the generation of a node. If the root node
is at level 0, then its next child node is at level 1, its grandchild is at level 2, and
so on.
Degree The degree of a tree is the maximum degree of a node in the given tree. A
node with degree zero is called a terminal node or a leaf.
For a binary tree to be a binary search tree, the data of all the nodes in the left sub-tree of
the root node should be less than or equal to the data of the root, and the data of all the
nodes in the right sub-tree of the root node should be greater than the data of the root.
Complete Binary Tree: A Binary Tree is complete Binary Tree if all levels are completely
filled except possibly the last level and the last level has all keys as left as possible
For example, consider a BST whose root node has data = 10: the subtree rooted at the
node with data = 5 and the subtree rooted at the node with data = 19 must each also
satisfy the same ordering. The BST property is recursive: every subtree must itself satisfy
the left and right subtree ordering.
Pre-order traversal
Post-order traversal
In-order traversal
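A small BST sketch in C with the three traversals listed above (the insert routine and the sample keys are illustrative assumptions; note that an in-order traversal of a BST visits the keys in sorted order):

#include <stdio.h>
#include <stdlib.h>

/* A binary search tree node: left subtree < data < right subtree. */
typedef struct Node {
    int data;
    struct Node *left, *right;
} Node;

Node *insert(Node *root, int data) {
    if (root == NULL) {
        Node *n = malloc(sizeof *n);
        n->data = data; n->left = n->right = NULL;
        return n;
    }
    if (data < root->data) root->left  = insert(root->left, data);
    else                   root->right = insert(root->right, data);
    return root;
}

/* In-order traversal (left, root, right) visits a BST in sorted order. */
void inorder(const Node *r)   { if (r) { inorder(r->left); printf("%d ", r->data); inorder(r->right); } }
/* Pre-order traversal: root, left, right. */
void preorder(const Node *r)  { if (r) { printf("%d ", r->data); preorder(r->left); preorder(r->right); } }
/* Post-order traversal: left, right, root. */
void postorder(const Node *r) { if (r) { postorder(r->left); postorder(r->right); printf("%d ", r->data); } }

int main(void) {
    int keys[] = {10, 5, 19, 3, 7, 15, 22};
    Node *root = NULL;
    for (int i = 0; i < 7; i++) root = insert(root, keys[i]);
    inorder(root);   printf("\n");   /* sorted: 3 5 7 10 15 19 22 */
    preorder(root);  printf("\n");
    postorder(root); printf("\n");
    return 0;
}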
Trees are useful and frequently used because they naturally represent hierarchical data
and, when kept balanced, allow fast search, insertion and deletion.
A degenerate (or pathological) tree: A Tree where every internal node has one child.
Such trees are performance-wise same as linked list.
AVL Tree: One of the more popular balanced trees, known as an AVL tree in data
structures, was introduced in 1962 by Adelson-Velsky and Landis. An AVL tree is a
binary search tree in which, for every node in the tree, the heights of the left and right
subtrees differ by at most 1.
Importance of Rotations :
The insert and delete operations of AVL tree are the same as binary search tree (BST)
Since an insertion (deletion) involves adding (deleting) a tree node, this can only
increase (decrease) the height of some subtree(s) by 1.
Thus, the AVL tree property may be violated.
If the AVL tree property is violated at a node x, it means that the heights of left(x) and
right(x) differ by exactly 2.
After the insertion or deletion operations, we need to examine the tree and see if any
node violates the AVL tree property
If the AVL tree property is violated at a node x, a single or double rotation is applied
to x to restore the AVL tree property.
Rotation will be applied in a bottom up manner starting at the place of
insertion(deletion)
Thus, when we perform a rotation at x, the AVL tree property is restored at all proper
descendants of x.
Spanning Tree: A spanning tree is a subset of Graph G, which has all the vertices
covered with minimum possible number of edges. Hence, a spanning tree does not
have cycles and it cannot be disconnected. A complete undirected graph can have
a maximum of n^(n-2) spanning trees, where n is the number of nodes.
All possible spanning trees of graph G, have the same number of edges and
vertices.
The spanning tree does not have any cycle (loops).
Removing one edge from the spanning tree will make the graph disconnected,
i.e. the spanning tree is minimally connected.
Adding one edge to the spanning tree will create a circuit or loop, i.e. the
spanning tree is maximally acyclic.
In a weighted graph, a minimum spanning tree is a spanning tree whose total weight is
less than or equal to that of every other spanning tree of the same graph. In real-world
situations, this
weight can be measured as distance, congestion, traffic load or any arbitrary value
denoted to the edges.
Two Important Minimum Spanning Tree:
1)Kruskal Algorithm
Kruskal's algorithm is a greedy algorithm in graph theory that finds a minimum
spanning tree for a connected weighted graph.
It finds a subset of the edges that forms a tree that includes every vertex, where the
total weight of all the edges in the tree is minimized.
This algorithm is directly based on the MST( minimum spanning tree) property.
2)Prims Algorithm
Prim's algorithm is a greedy algorithm that finds a minimum spanning tree for a
connected weighted undirected graph.It finds a subset of the edges that forms a tree
that includes every vertex, where the total weight of all the edges in the tree is
minimized.This algorithm is directly based on the MST( minimum spanning tree)
property.
BFS vs DFS:
BFS stands for Breadth First Search; DFS stands for Depth First Search.
BFS starts traversal from the root node and then explores the search level by level, i.e.
as close as possible to the root node. DFS starts traversal from the root node and
explores the search as far as possible from the root node, i.e. depth-wise.
Breadth First Search can be done with the help of a queue, i.e. a FIFO implementation.
Depth First Search can be done with the help of a stack, i.e. a LIFO implementation.
BFS works in a single stage: the visited vertices are removed from the queue and then
displayed at once. DFS works in two stages: in the first stage the visited vertices are
pushed onto the stack, and later, when there is no vertex further to visit, they are
popped off.
BFS requires more memory compared to DFS; DFS requires less memory compared to BFS.
BFS is useful in finding the shortest path: it can be used to find the shortest distance
between some starting node and the remaining nodes of the graph. DFS is not as useful
for finding the shortest path; it is used to perform a traversal of a general graph, and the
idea of DFS is to make a path as long as possible and then go back (backtrack) to add
other branches that are also as long as possible.
B Tree vs B+ Tree:
Description: A B tree is an organizational structure for information storage and retrieval
in the form of a tree in which all terminal nodes are at the same distance from the base,
and all non-terminal nodes have between n and 2n sub-trees or pointers (where n is an
integer). A B+ tree is an n-ary tree with a variable but often large number of children per
node; it consists of a root, internal nodes and leaves, and the root may be either a leaf
or a node with two or more children.
Search: O(log n) for a B tree; O(log_b n) for a B+ tree.
Insert: O(log n) for a B tree; O(log_b n) for a B+ tree.
Delete: O(log n) for a B tree; O(log_b n) for a B+ tree.
1. Processor management which involves putting the tasks into order and pairing
them into manageable size before they go to the CPU.
2. Memory management which coordinates data to and from RAM (random-access
memory) and determines the necessity for virtual memory.
3. Device management which provides interface between connected devices.
4. Storage management which directs permanent data storage.
5. Application interface, which allows standard communication between software and
your computer.
6. User interface which allows you to communicate with your computer.
The operating system makes the programming task easier. The common services
provided by the operating system are listed below.
o Program execution
o I/O operation
o File system manipulation
o Communications
o Error detection.
Resource allocation
Accounting
Protection
Process:
o The text section comprises the compiled program code, read in from non-
volatile storage when the program is launched.
o The data section stores global and static variables, allocated and
initialized prior to executing main.
o The heap is used for dynamic memory allocation, and is managed via
calls to new, delete, malloc, free, etc.
o The stack is used for local variables. Space on the stack is reserved for
local variables when they are declared ( at function entrance or
elsewhere, depending on the language ), and the space is freed up when
the variables go out of scope. Note that the stack is also used for function
return values, and the exact mechanisms of stack management may be
language specific.
o Note that the stack and the heap start at opposite ends of the process's
free space and grow towards each other. If they should ever meet, then
either a stack overflow error will occur, or else a call to new or malloc will
fail due to insufficient memory available.
When processes are swapped out of memory and later restored, additional
information must also be stored and restored. Key among them are the program
counter and the value of all program registers.
Ready - The process has all the resources it needs to run, but the CPU is not currently
working on this process's instructions.
Waiting - The process cannot run at the moment, because it is waiting for some resource
to become available or for some event to occur.
Process Scheduling:
Maximize CPU use, quickly switch processes onto CPU for time sharing
Process scheduler selects among available processes for next execution
on CPU
Maintains scheduling queues of processes
Schedulers
I/O-bound process spends more time doing I/O than computations, many
short CPU bursts
CPU-bound process spends more time doing computations; few very long
CPU bursts
Context Switch
When CPU switches to another process, the system must save the state of the
old process and load the saved state for the new process via a context switch.
Context of a process represented in the PCB
Context-switch time is overhead; the system does no useful work while
switching.The more complex the OS and the PCB -> longer the context switch
Time dependent on hardware support. Some hardware provides multiple sets of
registers per CPU -> multiple contexts loaded at once
What is a Thread?
A thread is a path of execution within a process. Also, a process can contain multiple
threads.
Why Multithreading?
A thread is also known as a lightweight process. The idea is to achieve parallelism by
dividing a process into multiple threads. For example, in a browser, multiple tabs can
be different threads. MS Word uses multiple threads: one thread to format the text,
another thread to process inputs, etc.
Process vs Thread?
The typical difference is that threads within the same process run in a shared memory
space, while processes run in separate memory spaces.
Unlike processes, threads are not independent of one another; as a result, threads share
their code section, data section and OS resources (such as open files and signals) with
other threads. But, like a process, a thread has its own program counter (PC), register
set, and stack space.
User Level Threads vs Kernel Level Threads:
User threads are implemented by users; kernel threads are implemented by the OS.
The OS does not recognize user level threads; kernel threads are recognized by the OS.
Implementation of user threads is easy; implementation of kernel threads is complicated.
User level threads need no hardware support; kernel level threads need hardware support.
If one user level thread performs a blocking operation then the entire process is blocked;
if one kernel thread performs a blocking operation then another thread can continue
execution.
Thread libraries provide programmers with an API for creating and managing threads.
Thread libraries may be implemented either in user space or in kernel space.
The user space involves API functions implemented solely within user space, with no
kernel support. The kernel space involves system calls, and requires a kernel with
thread library support.
Any solution to the critical section problem must satisfy three requirements:
Mutual Exclusion : If a process is executing in its critical section, then no other
process is allowed to execute in the critical section.
Progress : If no process is executing in its critical section, then a process that wishes
to enter its critical section must not be blocked indefinitely by processes that are
outside their critical sections.
Bounded Waiting : A bound must exist on the number of times that other
processes are allowed to enter their critical sections after a process has made a
request to enter its critical section and before that request is granted.
Peterson's Solution is a classical software-based solution to the critical section
problem.
TestAndSet is a hardware solution to the synchronization problem. In TestAndSet, we
have a shared lock variable which can take either of the two values, 0 or 1.
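On real hardware, TestAndSet is a single atomic instruction. The sketch below is a conceptual illustration in C, using C11's atomic_flag_test_and_set to play the role of that instruction; the acquire/release names are assumptions made for the example.

#include <stdatomic.h>
#include <stdio.h>

/* Conceptual TestAndSet-based lock: the flag is 0 when unlocked, 1 when locked. */
atomic_flag lock = ATOMIC_FLAG_INIT;

void acquire(void) {
    /* Atomically set the flag to 1 and return its old value;
     * spin (busy-wait) while some other thread already holds the lock. */
    while (atomic_flag_test_and_set(&lock))
        ;   /* busy wait */
}

void release(void) {
    atomic_flag_clear(&lock);   /* set the flag back to 0 */
}

int main(void) {
    acquire();
    printf("inside critical section\n");
    release();
    return 0;
}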
Semaphore:
A semaphore is hardware or a software tag variable whose value indicates the status
of a common resource. Its purpose is to lock the resource being used. A process which
needs the resource will check the semaphore for determining the status of the
resource followed by the decision for proceeding. In multitasking operating systems,
the activities are synchronized by using the semaphore techniques.
There are two types of semaphores : Binary Semaphores and Counting
Semaphores
Binary Semaphores : They can only be either 0 or 1. They are also known as
mutex locks, as the locks can provide mutual exclusion. All the processes can
share the same mutex semaphore, which is initialized to 1. A process has to
wait until the semaphore becomes 1; it then sets the semaphore to 0 and starts
its critical section. When it completes its critical section, it resets the value of
the semaphore to 1, and some other process can enter its critical
section.
Counting Semaphores : They can have any value and are not restricted over a
certain domain. They can be used to control access a resource that has a
limitation on the number of simultaneous accesses. The semaphore can be
initialized to the number of instances of the resource. Whenever a process wants
to use that resource, it checks if the number of remaining instances is more than
zero, i.e., the process has an instance available. Then, the process can enter its
critical section thereby decreasing the value of the counting semaphore by 1.
After the process is over with the use of the instance of the resource, it can leave
the critical section thereby adding 1 to the number of available instances of the
resource.
Semaphores are commonly use for two purposes: to share a common memory space
and to share access to files. Semaphores are one of the techniques for interprocess
communication (IPC). The C programming language provides a set of interfaces or
"functions" for managing semaphores.
Properties of Semaphores
1. Simple
2. Works with many processes
3. Can have many different critical sections with different semaphores
4. Each critical section has unique access semaphores
5. Can permit multiple processes into the critical section at once, if desirable
Deadlock: It is a state where two or more operations are waiting for each other, say a
computing action 'A' is waiting for action 'B' to complete, while action 'B' can only
execute when 'A' is completed. Such a situation is called a deadlock. In operating
systems, a deadlock situation arises when the computer resources required for the
completion of a computing task are held by another task that is itself waiting to execute.
The system thus goes into an indefinite wait, resulting in a deadlock. Deadlock in
operating systems is a common issue in multiprocessor systems and in parallel and
distributed computing setups.
The resources may be either physical or logical. Examples of physical resources are
Printers, Tape Drivers, Memory Space, and CPU Cycles. Examples of logical
resources are Files, Semaphores, and Monitors. The simplest example of deadlock is
where process 1 has been allocated non-shareable resource A, say, a tape drive, and
process 2 has been allocated non-shareable resource B, say, a printer. Now, if it turns out
that process 1 needs resource B (the printer) to proceed and process 2 needs
resource A (the tape drive) to proceed, and these are the only two processes in the
system, each blocks the other and all useful work in the system stops. This
situation is termed deadlock. The system is in a deadlock state because each process
holds a resource being requested by the other process, and neither process is willing to
release the resource it holds. Resources come in two flavors: preemptable and non
preemptable. A preemptable resource is one that can be taken away from the process
with no ill effects. Memory is an example of a preemptable resource. On the other
hand, a non preemptable resource is one that cannot be taken away from process
(without causing ill effect). For example, CD resources are not preemptable at an
arbitrary moment.Reallocating resources can resolve deadlocks that involve
preemptable resources. Deadlocks that involve non preemptable resources are difficult
to deal with.
Following three strategies can be used to remove deadlock after its occurrence:
1. Preemption: We can take a resource from one process and give it to another. This
will resolve the deadlock situation, but sometimes it causes problems.
2. Rollback: In situations where deadlock is a real possibility, the system can
periodically make a record of the state of each process; when deadlock occurs,
everything is rolled back to the last checkpoint and restarted, with resources
allocated differently so that the deadlock does not occur again.
3. Kill one or more processes: This is the simplest way, but it works.
Livelock: A situation in which two or more processes continuously change their states
in response to changes in the other process(es) without doing any useful work. It is
somewhat similar to deadlock, but the difference is that the processes are being "polite"
and letting each other do the work first. This can happen when a process is trying to
avoid a deadlock.
Dijkstra's Banker's Algorithm :
One reason this algorithm is not widely used in the real world is because to use it the
operating system must know the maximum amount of resources that every process is
going to need at all times. Therefore, for example, a just-executed program must
declare up-front that it will be needing no more than, say, 400K of memory. The
operating system would then store the limit of 400K and use it in the deadlock
avoidance calculations.The Banker's Algorithm seeks to prevent deadlock by
becoming involved in the granting or denying of system resources. Each time that a
process needs a particular non-sharable resource, the request must be approved by
the banker.
Memory Management:
Main Memory refers to a physical memory that is the internal memory to the computer.
The word main is used to distinguish it from external mass storage devices such as
disk drives. Main memory is also known as RAM. The computer is able to change only
data that is in main memory. Therefore, every program we execute and every file we
access must be copied from a storage device into main memory.
All the programs are loaded into the main memory for execution. Sometimes the complete
program is loaded into memory, but sometimes a certain part or routine of the program is
loaded into the main memory only when it is called by the program; this mechanism is
called Dynamic Loading, and it enhances performance. Also, at times one program is
dependent on some other program. In such a case, rather than loading all the dependent
programs, the CPU links the dependent programs to the main executing program when
they are required. This mechanism is known as Dynamic Linking.
Swapping
Swapping is a simple memory/process management technique used by the operating
system (OS) to increase the utilization of the processor by moving some blocked
processes from the main memory to the secondary memory (hard disk), thus forming a
queue of temporarily suspended processes, while execution continues with the newly
arrived process. After performing the swapping process, the operating system has two
options when selecting a process for execution: it can admit a newly created process, or
it can activate a suspended process from the swap memory.
If you have ever installed a Linux based operating system, you may have seen an option
or warning about the need for swap space. If you have enough primary memory (RAM),
e.g. greater than 2 GB, a desktop user may not need any swap space at all (I am using
Ubuntu 10.04 LTS with 4 GB of RAM and have no trouble without swap space), and in
some cases using swap memory may even slow down your computer's performance.
First Fit
The first hole that is big enough is allocated to program.
Best Fit
The smallest hole that is big enough is allocated to program.
Worst Fit
The largest hole that is big enough is allocated to program.
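A tiny sketch in C of the first-fit and best-fit strategies over a list of free holes (the hole sizes and the request value are made-up example numbers; worst fit would simply pick the largest hole instead):

#include <stdio.h>

/* First fit: scan the holes in order and take the first one big enough. */
int first_fit(const int holes[], int n, int request) {
    for (int i = 0; i < n; i++)
        if (holes[i] >= request) return i;   /* first hole that fits */
    return -1;                               /* no hole is big enough */
}

/* Best fit: pick the smallest hole that still fits the request. */
int best_fit(const int holes[], int n, int request) {
    int best = -1;
    for (int i = 0; i < n; i++)
        if (holes[i] >= request && (best == -1 || holes[i] < holes[best]))
            best = i;
    return best;
}

int main(void) {
    int holes[] = {100, 500, 200, 300, 600};   /* free block sizes in KB */
    printf("first fit for 212 KB -> hole %d\n", first_fit(holes, 5, 212));  /* 1 (500 KB) */
    printf("best  fit for 212 KB -> hole %d\n", best_fit(holes, 5, 212));   /* 3 (300 KB) */
    return 0;
}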
Fragmentation occurs in a dynamic memory allocation system when most of the free
blocks are too small to satisfy any request. It is generally termed the inability to use the
available memory. In such a situation processes are loaded into and removed from
memory. As a result, free holes exist that could satisfy a request, but they are
non-contiguous, i.e. the memory is fragmented into a large number of small holes. This
phenomenon is known as External Fragmentation. Also, at times the physical memory is
broken into fixed-size blocks and memory is allocated in units of block size. The memory
allocated to a process may then be slightly larger than the requested memory. The
difference between the allocated and the required memory is known as Internal
Fragmentation, i.e. memory that is internal to a partition but is of no use.
Paging: Computer memory is divided into small partitions that are all the same size,
referred to as page frames. When a process is loaded, it gets divided into pages which
are the same size as those frames. The process pages are then loaded into the frames.
A Page Table is the data structure used by a virtual memory system in a computer
operating system to store the mapping between virtual addresses and physical
addresses. A virtual address is also known as a logical address and is generated by the
CPU, while a physical address is the address that actually exists in memory.
Segmentation:
Involves programmer (allocates memory to specific function inside code)
Separate compiling
Separate protection
Share code
Each segment in this scheme is divided into pages and each segment is
maintained in a page table. So the logical address is divided into following 3
parts :
Segment numbers(S)
Page number (P)
The displacement or offset number (D)
Virtual Memory
Virtual memory is an approach to making use of secondary storage devices as an
extension of the primary storage of the computer. It is the process of increasing the
apparent size of a computer's RAM by using a section of the hard disk storage as an
extension of RAM: logically-assigned memory that may or may not exist physically.
Through the use of paging and the swap area, more memory can be referenced and
allocated than actually exists on the system, thus giving the appearance of a larger
main memory than actually exists. Virtual memory is commonly implemented by
demand paging. It can also be implemented in a segmentation system. Demand
segmentation can also be used to provide virtual memory.
Benefits of having Virtual Memory :
1. Large programs can be written, as virtual space available is huge compared to
physical memory.
2. Less I/O required, leads to faster and easy swapping of processes.
3. More physical memory is available, as programs are stored in virtual memory, so
they occupy very little space in actual physical memory.
Demand Paging
A demand paging system is quite similar to a paging system with swapping where
processes reside in secondary memory and pages are loaded only on demand, not in
advance. When a context switch occurs, the operating system does not copy any of
the old program's pages out to the disk or any of the new program's pages into the
main memory. Instead, it just begins executing the new program after loading the first
page and fetches that program's pages as they are referenced.
While executing a program, if the program references a page which is not available in
the main memory because it was swapped out a little while ago, the processor treats this
invalid memory reference as a page fault and transfers control from the program to
the operating system, which demands the page back into memory.
Advantages
Following are the advantages of Demand Paging
Easy to implement, keep a list, replace pages by looking back into time.
Thrashing: A process that is spending more time paging than executing is said to be
thrashing. In other words it means, that the process doesn't have enough frames to
hold all the pages for its execution, so it is swapping pages in and out very frequently
to keep executing. Sometimes, the pages which will be required in the near future have
to be swapped out. Initially, when CPU utilization is low, the process scheduling
mechanism loads multiple processes into memory at the same time to increase the level
of multiprogramming, allocating a limited number of frames to each process. As memory
fills up, processes start to spend a lot of time waiting for their required pages to be
swapped in, again leading to low CPU utilization because most of the processes are
waiting for pages. The scheduler therefore loads even more processes to try to increase
CPU utilization, and as this continues, at some point the complete system comes to a halt.
FILE DIRECTORIES:
Collection of files is a file directory. The directory contains information about the files,
including attributes, location and ownership. Much of this information, especially that is
concerned with storage, is managed by the operating system. The directory is itself a
file, accessible by various file management routines.
SINGLE-LEVEL DIRECTORY
In this a single directory is maintained for all the users.
Naming problem: Users cannot have same name for two files.
Grouping problem: Users cannot group files according to their need.
TWO-LEVEL DIRECTORY
The way that files are accessed and read into memory is determined by Access
methods. Usually a single access method is supported by systems while there are
OS's that support multiple access methods.
Sequential Access
Direct Access
Files are allocated disk spaces by operating system. Operating systems deploy
following three main ways to allocate disk space to files.
Contiguous Allocation
Linked Allocation
Indexed Allocation
Contiguous Allocation
Mutual Exclusion
A way of making sure that if one process is using shared modifiable data, the other
processes are excluded from doing the same thing. Formally, while one process is
executing on the shared variable, all other processes desiring to do so at the same
moment should be kept waiting; when that process has finished executing on the shared
variable, one of the waiting processes should be allowed to proceed.
In this fashion, each process executing the shared data (variables) excludes all others
from doing so simultaneously. This is called Mutual Exclusion.
Note that mutual exclusion needs to be enforced only when processes access shared
modifiable data - when processes are performing operations that do not conflict with
one another they should be allowed to proceed concurrently.
If we could arrange matters such that no two processes were ever in their critical
sections simultaneously, we could avoid race conditions. We need four conditions to
hold to have a good solution for the critical section problem (mutual exclusion).
No two processes may be simultaneously inside their critical sections.
No assumptions are made about the relative speeds of processes or the number of
CPUs.
No process outside its critical section should block other processes.
No process should have to wait arbitrarily long to enter its critical section.
System Call: System calls provide an interface between the process and the
operating system.
System calls allow user-level processes to request some services from the
operating system which process itself is not allowed to do.
In handling the trap, the operating system will enter in the kernel mode, where it
has access to privileged instructions, and can perform the desired service on
the behalf of user-level process.
It is because of the critical nature of operations that the operating system itself
does them every time they are needed.
For example, for I/O a process involves a system call telling the operating
system to read or write particular area and this request is satisfied by the
operating system.
Types of System calls
Process control
File management
Device management
Information maintenance
Communications
1) Process Control:
A running program needs to be able to stop execution either normally or
abnormally.
When execution is stopped abnormally, often a dump of memory is taken and
can be examined with a debugger.
Following are functions of process control:
i. end, abort
ii. load, execute
iii. create process, terminate process
iv. get process attributes, set process attributes
v. wait for time
vi. wait event, signal event
vii. allocate and free memory
2) File management :
We first need to be able to create and delete files. Either system call requires
the name of the file and perhaps some of the file's attributes.
Once the file is created, we need to open it and to use it. We may also read,
write, or reposition. Finally, we need to close the file, indicating that we are no
longer using it.
We may need these same sets of operations for directories if we have a
directory structure for organizing files in the file system.
In addition, for either files or directories, we need to be able to determine the
values of various attributes and perhaps to reset them if necessary. File
attributes include the file name, a file type, protection codes, accounting
information, and so on
Functions:
o create file, delete file
o open, close file
o read, write, reposition
o get and set file attributes
3) Device Management:
A process may need several resources to execute - main memory, disk drives,
access to files, and so on. If the resources are available, they can be granted,
and control can be returned to the user process. Otherwise, the process will
have to wait until sufficient resources are available.
The various resources controlled by the OS can be thought of as devices. Some
of these devices are physical devices (for example, tapes), while others can be
thought of as abstract or virtual devices (for example, files).
Once the device has been requested (and allocated to us), we can read, write,
and (possibly) reposition the device, just as we can with files.
In fact, the similarity between I/O devices and files is so great that many OSs,
including UNIX, merge the two into a combined file-device structure.
A set of system calls is used on files and devices. Sometimes, I/O devices are
identified by special file names, directory placement, or file attributes.
Functions:
o request device, release device
o read, write, reposition
o get device attributes, set device attributes
o logically attach or detach devices
Information Maintenance
Many system calls exist simply for the purpose of transferring information
between the user program and the OS. For example, most systems have a
system call to return the current time and date.
Other system calls may return information about the system, such as the
number of current users, the version number of the OS, the amount of free
memory or disk space, and so on.
In addition, the OS keeps information about all its processes, and system calls
are used to access this information. Generally, calls are also used to reset the
process information.
Functions:
get time or date, set time or date
get system data, set system data
get and set process, file, or device attributes
Communication
There are two common models of interprocess communication: the message-
passing model and the shared-memory model. In the message-passing model,
the communicating processes exchange messages with one another to transfer
information.
In the shared-memory model, processes use shared memory creates and
shared memory attaches system calls to create and gain access to regions of
memory owned by other processes.
Recall that, normally, the OS tries to prevent one process from accessing
another process's memory. Shared memory requires that two or more
processes agree to remove this restriction. They can then exchange information
by reading and writing data in the shared areas.
Message passing is useful for exchanging smaller amounts of data, because no
conflicts need be avoided. It is also easier to implement than is shared memory
for intercomputer communication.
Shared memory allows maximum speed and convenience of communication,
since it can be done at memory speeds when it takes place within a computer.
Problems exist, however, in the areas of protection and synchronization
between the processes sharing memory.
Functions:
o create, delete communication connection
o send, receive messages
o transfer status information
o Attach and Detach remote devices
The fork() system call is used to create processes. When a process (a program
in execution) makes a fork() call, an exact copy of the process is created. Now
there are two processes, one being the parent process and the other being
the child process.The process which called the fork() call is the parent process
and the process which is created newly is called the child process. The child
process will be exactly the same as the parent. Note that the process state of the
parent i.e., the address space, variables, open files etc. is copied into the child
process. This means that the parent and child processes have identical but
physically different address spaces. A change of values in the parent process
doesn't affect the child, and vice versa. Both processes start execution
from the next line of code, i.e., the line after the fork() call. The exec() system call
is also used to create processes, but there is one big difference
between the fork() and exec() calls. The fork() call creates a new process while
preserving the parent process, whereas an exec() call replaces the address space,
text segment, data segment etc. of the current process with the new program. This
means that, after an exec() call, only the new program exists; the process image that
made the system call no longer exists.
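A minimal fork()/exec() sketch using the POSIX API (fork, execl, waitpid); the choice of running /bin/ls with -l is just an illustrative example.

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    pid_t pid = fork();                 /* duplicate the current process */

    if (pid < 0) {
        perror("fork");                 /* fork failed */
        return 1;
    } else if (pid == 0) {
        /* Child: after execl succeeds, only the new program exists. */
        execl("/bin/ls", "ls", "-l", (char *)NULL);
        perror("execl");                /* reached only if exec fails */
        return 1;
    } else {
        /* Parent: its address space is untouched by the child's exec(). */
        int status;
        waitpid(pid, &status, 0);
        printf("parent: child %d finished\n", (int)pid);
    }
    return 0;
}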
Device Controller
Device drivers are software modules that can be plugged into an OS to handle a
particular device. Operating System takes help from device drivers to handle all I/O
devices.The Device Controller works like an interface between a device and a device
driver. I/O units (Keyboard, mouse, printer, etc.) typically consist of a mechanical
component and an electronic component where electronic component is called the
device controller.
There is always a device controller and a device driver for each device to
communicate with the Operating Systems. A device controller may be able to handle
multiple devices. As an interface its main task is to convert serial bit stream to block of
bytes, perform error correction as necessary.
Any device connected to the computer is connected by a plug and socket, and the
socket is connected to a device controller. Following is a model for connecting the
CPU, memory, controllers, and I/O devices where CPU and device controllers all use
a common bus for communication.
Polling I/O
Polling is the simplest way for an I/O device to communicate with the processor. The
process of periodically checking status of the device to see if it is time for the next I/O
operation, is called polling. The I/O device simply puts the information in a Status
register, and the processor must come and get the information. Most of the time,
devices will not require attention and when one does it will have to wait until it is next
interrogated by the polling program. This is an inefficient method, and much of the
processor's time is wasted on unnecessary polls.
Compare this method to a teacher continually asking every student in a class, one
after another, if they need help. Obviously the more efficient method would be for a
student to inform the teacher whenever they require assistance.
Interrupts I/O
An alternative scheme for dealing with I/O is the interrupt-driven method. An interrupt
is a signal to the microprocessor from a device that requires attention. A device
controller puts an interrupt signal on the bus when it needs the CPU's attention. When
the CPU receives an interrupt, it saves its current state and invokes the appropriate
interrupt handler using the interrupt vector (addresses of OS routines to handle
various events). When the interrupting device has been dealt with, the CPU continues
with its original task as if it had never been interrupted.
UNIX is an operating system which was first developed in the 1960s, and has
been under constant development ever since. By operating system, we mean the
suite of programs which make the computer work. It is a stable, multi-user, multi-
tasking system for servers, desktops and laptops.
UNIX systems also have a graphical user interface (GUI) similar to Microsoft
Windows which provides an easy to use environment. However, knowledge of
UNIX is required for operations which aren't covered by a graphical program, or for
when there is no windows interface available, for example, in a telnet session.
There are many different versions of UNIX, although they share common
similarities. The most popular varieties of UNIX are Sun Solaris, GNU/Linux, and
MacOS X. Redhat is the most popular distribution because it has been ported to a
large number of hardware platforms (including Intel, Alpha, and SPARC), it is easy
to use and install and it comes with a comprehensive set of utilities and applications
including the X Windows graphics system, GNOME and KDE GUI environments,
and the StarOffice suite (an open source MS-Office clone for Linux).
Support code which is not required to run in kernel mode is in the System Library. User
programs and other system programs work in User Mode, which has no access to
system hardware and kernel code. User programs and utilities use the system libraries
to access kernel functions to get the system's low-level tasks done.
Kernel
The Linux kernel includes device driver support for a large number of PC
hardware devices (graphics cards, network cards, hard disks etc.), advanced
processor and memory management features, and support for many different
types of filesystems (including DOS floppies and the ISO9660 standard for
CDROMs). In terms of the services that it provides to application programs and
system utilities, the kernel implements most BSD and SYSV system calls, as
well as the system calls described in the POSIX.1 specification.
The kernel (in raw binary form that is loaded directly into memory at system
startup time) is typically found in the file /boot/vmlinuz, while the source files can
usually be found in /usr/src/linux.
System Utilities
Virtually every system utility that you would expect to find on standard
implementations of UNIX (including every system utility described in the
POSIX.2 specification) has been ported to Linux. This includes commands such
as ls, cp, grep, awk, sed, bc, wc, more, and so on. These system utilities are
designed to be powerful tools that do a single task extremely well
(e.g. grep finds text inside files while wc counts the number of words, lines and
bytes inside a file). Users can often solve problems by interconnecting these
tools instead of writing a large monolithic application program.
Application programs
Linux distributions typically come with several useful application programs as
standard. Examples include the emacs editor, xv (an image viewer), gcc (a C
compiler), g++ (a C++ compiler), xfig (a drawing package), latex (a powerful
typesetting language) and soffice (StarOffice, which is an MS-Office style clone
that can read and write Word, Excel and PowerPoint files).
Redhat Linux also comes with rpm, the Redhat Package Manager which makes
it easy to install and uninstall application programs.
When you connect to a UNIX computer remotely (using telnet) or when you log in
locally using a text-only terminal, you will see the prompt:
login:
At this prompt, type in your username and press the enter/return key. Remember
that UNIX is case sensitive (i.e. Will, WILL and will are all different logins). You should
then be prompted for your password:
login: will
password:
Type your password in at the prompt and press the enter/return key. Note that
your password will not be displayed on the screen as you type it in.
If you mistype your username or password you will get an appropriate message from
the computer and you will be presented with the login: prompt again. Otherwise you
should be presented with a shell prompt which looks something like this:
To log out of a text-based UNIX shell, type "exit" at the shell prompt (or if that doesn't
work try "logout"; if that doesn't work press ctrl-d).
Graphical terminals:
If you're logging into a UNIX computer locally, or if you are using a remote login facility
that supports graphics, you might instead be presented with a graphical prompt with
login and password fields. Enter your user name and password in the same way as
above (N.B. you may need to press the TAB key to move between fields).
Once you are logged in, you should be presented with a graphical window manager
that looks similar to the Microsoft Windows interface. To bring up a window containing
a shell prompt look for menus or icons which mention the words "shell", "xterm",
"console" or "terminal emulator".
To log out of a graphical window manager, look for menu options similar to "Log out" or
"Exit".
Linux Commands
These commands will work with most (if not all) distributions of Linux as well as most
(?) implementations of Unix. They're the commands that everybody knows. To be able
to survive in Linux, you should know these. There aren't always handy-dandy tools for
X that shield you, especially if you're managing your own system, stuff often goes
wrong and you're forced to work with the bare minimum.
1. Navigation - how to get around
o cd - changing directories
o ls - listing files
o pwd - knowing where you are
2. File Management - who needs a graphical file manager?
o cp - copying files
o ln - creating symbolic links
o mv - moving and renaming files
o rm - removing files
3. Editing - using text editors for those nasty configuration files
o emacs - another widely used text editor
o pico - for wussies like myself
o vim - an improved version of the standard Unix text editor
4. Monitoring Your System - to satisfy your insatiable curiosity
o tail - follow a file as it grows
o top - a program to see how your memory and CPU are holding up
o w - look at who's logged on
Navigation
Navigating around the files and directories of your hard drive could be a dreaded task
for you, but it is necessary knowledge. If you were a user of command prompt
interfaces such as MS-DOS, you'll have little trouble adjusting. You'll only need to learn
a few new commands. If you're used to navigating using a graphical file manager, I
don't know how it'll be like, but some concepts might require a little more clarification.
Or maybe it'll be easier for you. Who knows? Everyone is different.
cd
As you might already have guessed, the cd command changes directories. It's a very
common navigation command that you'll end up using, just like you might have done in
MS-DOS.
You must put a space between cd and the ".." or else it won't work; Linux doesn't see
the two dots as an extension to the cd command, but rather a different command
altogether. It'll come to make sense if it doesn't already.
ls
The ls letters stand for list. It basically works the same way as the dir command in
DOS. Only being a Unix command, you can do more with it. :-)
Typing ls will give you a listing of all the files in the current directory. If you're new to
Linux, chances are that the directories you are commonly in will be empty, and after
the ls command is run, you aren't given any information and will just be returned to the
command prompt (the shell).
There are "hidden" files in Linux, too. Their file names start with a dot, and doing a
normal ls won't show them in a directory. Many configuration files start with a dot on
their file names because they would only get in the way of users who would like to see
more commonly used items. To view hidden files, use the -a flag with the ls command,
i.e. ls -a.
To view more information about the files in a directory, use the -l flag with ls. It will
show the file permissions as well as the file size, which are probably what are the most
useful things to know about files.
You might occasionally want to have a listing of all the subdirectories, also. A simple -
R flag will do, so you could look upon ls -R as a rough equivalent of the dir /s
command in MS-DOS.
You can put flags together, so to view all the files in a directory, show their
permissions/size, and view all the files that way through the subdirectories, you could
type ls -laR.
pwd
This command simply shows what directory you're in at the moment. It stands for "Print
Working Directory". It's useful for scripting in case you might ever want to refer to your
current directory.
File Management
A lot of people, surprisingly for me, prefer to use graphical file managers. Fortunately
for me, I wasn't spoiled like that and used commands in DOS. That made it a bit easier
for me to make the transition to Linux. Most of the file management Linux gurus do is
through the command line, so if you learn to use the commands, you can brag that
you're a guru. Well, almost.
cp
Copying works very much the same. The cp command can be used just like the MS-
DOS copy command, only remember that directories are separated with slashes (/)
instead of backslashes (\). So a basic command line is just cp filename1 filename2.
There are other flags for the cp command. You can use the -f flag to force the copy.
You can use the -p flag to preserve the permissions (as well as the ownership and
timestamps) of the file.
You can copy an entire directory to a new destination. Let's say you want to copy a
directory (and all of its contents) from where you are to /home/jack/newdirectory/.
You would type cp -rpf olddirectory /home/jack/newdirectory. To issue this
command you would have to be in the directory where the subdirectory "olddirectory"
is actually located.
ln
The most simple way that I've ever used ln to create symbolic links is ln -s
existing_file link. Evidently there's a hard link and a symbolic link; I've been using a
symbolic link all along. You can also use the -f flag to force the command line to
overwrite anything that might have the symbolic link's file name already.
To remove a symbolic link, simply type rm symbolic_link. It won't remove the file that
it's linked to.
mv
The mv command can be used both to move files and to rename them. The syntax is
mv fileone filetwo, where "fileone" is the original file name and "filetwo" will be the
new file name.
You can't move a directory that is located in one partition to another, unfortunately.
You can copy it, though, using cp -rpf, and then remove it with rm -rf later on. If you
have only a single partition that makes up your filesystem then you have very little to
worry about in this area.
rm
The rm command is used for removing files. You use it just like the del or delete
command in MS-DOS. Let's say you want to remove a file called foobar in your current
directory. To do that, simply type rm foobar. Note that there is no "Recycle Bin" like in
Windows 95. So when you delete a file, it's gone for good.
To delete something in some other directory, use the full path as the file name. For
example, if you want to delete a file called "windows" that's in the directory
/usr/local/src/, you would type rm /usr/local/src/windows.
To remove an entire directory and its contents, type rm -rf /directory where
"/directory" is the path to the directory that you want to delete. If you're wondering, the
"rf" stands for "recursive" and "force". Be very careful with this command, as it can
wreak havoc easily if misused.
Editing
If you haven't figured out how important a text editor is, you soon will. Graphical
interfaces can't shield you forever, and those utilities have their limits. Besides, if
you're reading this page, I'm inclined to think that you want to be able to customize
beyond the capabilities of graphical utilities. You want to work at the command prompt.
I know you do.
The basic syntax to invoke these text editors is the same. Type the name of the editor
followed by the file you want to edit, separated by a space in between. Non-existent
files will be blank. Blank files will be blank as well.
emacs
To use GNU Emacs (or its counterpart, XEmacs), there are really only two commands
you need to know. Heck, they're the only ones I know.
While you're editing a certain file with emacs or xemacs, you can save it with the
[Ctrl]-x [Ctrl]-s keystrokes. Then to exit, type [Ctrl]-x [Ctrl]-c.
pico
The instructions for using pico are located on the screen. You save the file by using
the [Ctrl]-o keystroke (for write-out) and exit with [Ctrl]-x.
As a permanent solution, you probably don't want to use pico. It lacks real power.
Since I am such a wuss, however, I still have the bad habit of using pico once in a
while. Why? By pressing [Ctrl]j I can get entire paragraphs wrapped into a nice
justified block. I don't know how to do that with the other text editors.
vim
Most modern distributions include vim, derived from the infamously arcane Unix editor,
vi. (It stands for vi Improved, as a matter of fact.)
Using vim is different in that there are several modes in which you use it. To do actual
editing of the files, press [ESC] i (both separately). Then to save it, press [ESC] : w.
Escape, the colon, and "w" should be keyed in one after the other. Finally, to quit, type
[ESC] : q. The same rules apply as in previous vim commands.
You can use "w" and "q" at the same time to enable yourself to write to the file and
then quit right afterwards. Just press [ESC] : w q.
An important part of system administration (especially with your own system) is being
able to know what's going on.
tail
The program tail allows you to follow a file as it is growing. Most often, I use it to follow
/var/log/messages. I do that by typing tail -f /var/log/messages. Of course, you can
use anything else, including the other logs in /var/log/. Another file you may want to
keep an eye out for is /var/log/secure.
If you want to leave that running all the time, I recommend having some sort of
terminal program in X, logged in as root through su.
Another program you may want to look at is head. It shows the beginning of the specified
file instead of the end (it does not follow the file as it grows).
top
This program shows a continuously updated summary of what is going on with your
system: the running processes and how much CPU and memory they use. Press q inside
the program to quit.
Typing w will tell you who is logged in. This can be helpful if you're the only one who
uses your computer and you see someone logged in that's not supposed to be.
To shut down your system, type shutdown -h now, which tells the shutdown program
to begin the system halt immediately. You can also tell it to halt the system at a later time
(for example, shutdown -h +10 waits ten minutes); consult the shutdown manual page for
the details (man shutdown).
To do a reboot, you can either type reboot or shutdown -r now. You can also use the
famous Ctrl-Alt-Delete combination to reboot, which you might already be familiar with.
Shutting down and restarting properly (as described above) will prevent your filesystem
from being damaged. Filesystem damage is the most obvious of the consequences,
but there are probably other things out there that I don't know about. The point is, shut
down your system properly.
There are (rare!) cases in which the machine might lock up entirely, and prevent you
from being able to access a command prompt. Only then will your last resort be to do a
forced reboot (just pressing the restart button on the case).
When you run the terminal, the Shell issues a command prompt (usually $), where
you can type your input, which is then executed when you hit the Enter key. The output
or the result is thereafter displayed on the terminal. The Shell wraps around the
delicate interior of an Operating system protecting it from accidental damage. Hence
the name Shell.
1. The Bourne Shell: The prompt for this shell is $ and its derivatives include the Korn shell (ksh) and the Bourne Again shell (bash).
2. The C shell: The prompt for this shell is % and its subcategories include the TENEX C shell (tcsh).
Writing a series of commands for the shell to execute is called shell scripting. It
can combine lengthy and repetitive sequences of commands into a single, simple
script, which can be stored and executed anytime. This reduces the effort
required by the end user. "#!" is an operator called the shebang which directs the script to
the interpreter's location. So, if we use "#!/bin/sh" the script is handed to the Bourne
shell. Variables store data in the form of characters and numbers. Similarly, shell
variables are used to store information, and they can be used by the shell only.
Command Description
bg Sends a process to the background
fg Runs a stopped process in the foreground
top Shows details on all active processes
ps Gives the status of processes running for a user
ps PID Gives the status of a particular process
pidof Gives the process ID (PID) of a process
kill PID Kills a process
nice Starts a process with a given priority
renice Changes the priority of an already running process
df Gives the free hard disk space on your system
free Gives the free RAM on your system
Any running program or a command given to a Linux system is called a
process
A process could run in foreground or background
The priority index of a process is called niceness in Linux. Its default value is 0
and it can vary between -20 and 19
The lower the Niceness index the higher would be priority given to that task
Some Commands:
1) mv
The mv command - move - allows a user to move a file to another folder or directory.
Just like dragging a file located on a PC desktop to a folder stored within the
"Documents" folder, the mv command functions in the same manner.
2) man
The man command - the manual command - is used to show the manual of the
inputted command. Just like a film on the nature of film, the man command is the meta
command of the Linux CLI. Inputting the man command will show you all information
about the command you are using.
man cd: The inputting command will show the manual or all relevant information for
the change directory command.
3) mkdir
The mkdir - make directory - command allows the user to make a new directory. Just
like making a new directory within a PC or Mac desktop environment, the mkdir
command makes new directories in a Linux environment.
4) rmdir
The rmdir - remove directory - command allows the user to remove an existing
directory using the Linux CLI.
Both the mkdir and rmdir commands make and remove directories. They do not
make files and they will also not remove a directory which has files in it. The
mkdir will make an empty directory and the rmdir command will remove an
empty directory.
5) touch
The touch command - a.k.a. the make file command - allows users to make files using
the Linux CLI. Just as the mkdir command makes directories, the touch command
makes files. Just as you would make a .doc or a .txt using a PC desktop, the touch
command makes empty files.
6) locate
The locate command is meant to find a file within the Linux OS. If you aren't sure
where a certain file is saved and stored, the locate command comes in handy.
Programming on Perl does not cause portability issues, which is common when
using different shells in shell scripting.
Error handling is very easy on Perl
You can write long and complex programs on Perl easily due to its vastness.
This is in contrast with the shell, which does not support namespaces, modules,
objects, inheritance etc.
The shell has fewer reusable libraries available, nothing compared to Perl's CPAN.
The shell is less secure: it calls external programs (commands like mv, cp etc.
depend on the shell being used). Perl, on the contrary, does useful work while
using internal functions.
C/C++
The C language was developed in 1972 by Dennis Ritchie at Bell Telephone
laboratories, primarily as a systems programming language. That is, a language to
write operating systems with. Ritchie's primary goals were to produce a minimalistic
language that was easy to compile, allowed efficient access to memory, produced
efficient code, and did not need extensive run-time support. Thus, for a high-level
language, it was designed to be fairly low-level, while still encouraging platform-
independent programming.
C++ (pronounced see plus plus) was developed by Bjarne Stroustrup at Bell Labs as
an extension to C, starting in 1979. C++ adds many new features to the C language,
and is perhaps best thought of as a superset of C, though this is not strictly true as
C99 introduced a few features that do not exist in C++. C++'s claim to fame results
primarily from the fact that it is an object-oriented language. As for what an object is
and how it differs from traditional programming methods, well, we'll cover that in
chapter 8 (Basic object-oriented programming).
C++ is an Object Oriented Programming language but is not purely object oriented.
Its features like friend and virtual violate some of the very important OOPS principles,
rendering the language unworthy of being called completely object oriented. It is a
middle-level language.
Header files are included at the beginning, just like in a C program. Here iostream is a
header file which provides us with input & output streams. Header files contain
predeclared function libraries, which can be used by users for their ease.
using namespace std tells the compiler to use the standard namespace. A namespace
collects the identifiers used for classes, objects and variables. A namespace can be used
in two ways in a program: either by a using statement at the beginning, as in the
above-mentioned program, or by using the name of the namespace as a prefix before the
identifier with the scope resolution (::) operator.
main() is the function which holds the executing part of the program; its return type is int.
cout << is used to print anything on screen, same as printf in C.
cin and cout are the counterparts of scanf and printf; the only difference is that you do not
need to mention format specifiers like %d for int etc. with cout & cin.
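For illustration, here is a minimal sketch of a complete C++ program that uses the iostream
header, the std namespace, main(), cout and cin as described above (the variable name age
is just an example):

#include <iostream>     // provides the input/output streams cout and cin
using namespace std;    // lets us write cout instead of std::cout

int main()
{
    int age;                        // variable to hold the user's input
    cout << "Enter your age: ";     // prints a prompt on the screen
    cin >> age;                     // reads an integer from the keyboard
    cout << "You entered " << age << endl;
    return 0;                       // main() returns an int
}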
A library is a collection of precompiled code (e.g. functions) that has been packaged
up for reuse in many different programs. Libraries provide a common way to extend
what your programs can do. The C++ core language is actually very small and
minimalistic (and you'll learn most of it in these tutorials). However, C++ also comes
with a library called the C++ standard library that provides additional functionality for
your use. The C++ standard library is divided into areas (sometimes also called
libraries, even though they're just parts of the standard library), each of which focuses on
providing a specific type of functionality. One of the most commonly used parts of the
C++ standard library is the iostream library, which contains functionality for writing to
the screen and getting input from a console user.
Variables:
"Variable is a memory location in C++ Programming language".Variable are used
to store data on memory.
o Variable name can consist of letter, alphabets and start with underscore
character.
o First character of variable should always be alphabet and cannot be digit.
o Blank spaces are not allowed in variable name.
o Special characters like #, $ are not allowed.
o A single variable can only be declared for only 1 data type in a program.
o As C++ is case sensitive language so if we declare a variable name and one
more NAME both are two different variables.
o C++ has certain keywords which cannot be used for variable name.
o A variable name can be consist of 31 characters only if we declare a variable
more than 1 characters compiler will ignore after 31 characters.
1. int
2. float
3. double
4. char
Type sensitivity:-
If we choose the char data type, the value should be written in single quotes: '5' is
stored as the character 5 (its ASCII code), which is not the same as the integer 5
stored in an int variable.
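A small sketch showing variables of the four basic data types, and the difference between
the integer 5 and the character '5' (the variable names are only illustrative):

#include <iostream>
using namespace std;

int main()
{
    int count = 10;        // whole number
    float price = 4.5f;    // single-precision decimal number
    double pi = 3.14159;   // double-precision decimal number
    char grade = 'A';      // a single character in single quotes

    char digit = '5';      // stores the character '5' (ASCII code 53)
    int number = 5;        // stores the integer value 5

    cout << count << " " << price << " " << pi << " " << grade << endl;
    cout << "digit as character: " << digit
         << ", its ASCII code: " << (int)digit
         << ", number: " << number << endl;
    return 0;
}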
ASCII VALUES OF CHARACTERS
Integers are those values which have no decimal part; they can be positive or negative,
like 12 or -12.
1. int
2. short int
3. long int
int
Signed int
The range of a signed int variable (on a 16-bit compiler) is -32768 to 32767. It can hold
both positive and negative values.
Unsigned int
This type of integer cannot hold negative values. Its range (on a 16-bit compiler) is 0 to 65535.
Long int
float
Scope of Variables
All variables have their area of functioning, and outside that boundary they don't hold
their value; this boundary is called the scope of the variable. In most cases a variable
exists only between the curly braces in which it is declared, not outside them. We will
study the storage classes later, but as of now, we can broadly divide variables into two
main types,
Global Variables
Local variables
Global variables
Global variables are those which are declared once and can be used throughout the
lifetime of the program by any class or any function. They must be declared outside
the main() function. If only declared, they can be assigned different values at different
times in the program's lifetime. But even if they are declared and initialized at the same
time outside the main() function, they can still be assigned any value at any point in
the program.
Local Variables
Local variables are the variables which exist only between the curly braces in which they
are declared. Outside that they are unavailable, and using them there leads to a
compile-time error.
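A short sketch of global versus local scope (the names total and show() are only for
illustration):

#include <iostream>
using namespace std;

int total = 100;       // global variable: declared outside main(), visible everywhere

void show()
{
    int local = 5;     // local variable: exists only inside this function's braces
    cout << "inside show(): total = " << total << ", local = " << local << endl;
}

int main()
{
    show();
    total = 200;       // the global can be changed from any function
    cout << "inside main(): total = " << total << endl;
    // cout << local;  // error: 'local' is not visible here
    return 0;
}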
A constant in C++ means an unchanging value; each constant has a type but
does not have a memory location, except the string constant.
Integer Constants
Integer constants consist of one or more digits such as 0,1,2,3,4 or -115. Floating
point constants contain a decimal point such as 4.15, -10.05. It can also be written in
scientific notation such as 1E-35 means 1*10^-35 or -1E35 means 1*10^35.
Character Constants
Character constants specify the numeric value of that particular character: 'a', for
example, has the ASCII value of the letter a. Some special constants are in the following table.
String Constants
String constants consist of characters enclosed in double quotes, such as
"Hello, World".
The string is stored in the memory and the numeric value of that constant is the
address of this memory. The string constant is suffixed by \0, (the null character) by
the compiler.
Both C and C++ use escape sequences in the same manner, such as the \n character to
produce a new line. All escape sequences are preceded by a backslash, which
indicates a special sequence to the compiler. The compiler views each escape sequence
as a single character. It may seem that an escape sequence occupies 2 bytes, but that is
wrong; it occupies only one byte.
Operators:
Operators are special symbols used for specific purposes. C++ provides many
operators for manipulating data.
Generally, there are six type of operators : Arithmetical operators, Relational
operators, Logical operators, Assignment operators, Conditional operators, Comma
operator.
Arithmetical operators
Arithmetical operators +, -, *, /, and % are used to perform arithmetic (numeric)
operations.
Operator Meaning
+ Addition
- Subtraction
* Multiplication
/ Division
% Modulus
You can use the operators +, -, *, and / with both integral and floating-point data types.
Modulus or remainder % operator is used only with the integral data type.
Binary operators
Operators that have two operands are called binary operators.
Unary operators
C++ provides two unary operators for which only one variable is required.
For Example
a = - 50;
a = + 50;
Here plus sign (+) and minus sign (-) are unary because they are not used between
two variables.
Relational operators
The relational operators are used to test the relation between two values. All relational
operators are binary operators and therefore require two operands. A relational
expression returns zero when the relation is false and a non-zero when it is true. The
following table shows the relational operators.
Relational Operators Meaning
< Less than
<= Less than or equal to
== Equal to
> Greater than
>= Greater than or equal to
!= Not equal to
Logical operators
The logical operators are used to combine one or more relational expression. The
logical operators are
Operators Meaning
|| OR
&& AND
! NOT
Assignment operator
The assignment operator '=' is used for assigning a variable to a value. This operator
takes the expression on its right-hand-side and places it into the variable on its left-
hand-side. For example:
m = 5;
The operator takes the expression on the right, 5, and stores it in the variable on the
left, m.
x = y = z = 32;
This code stores the value 32 in each of the three variables x, y, and z.
Conditional operator
The conditional operator ?: is called ternary operator as it requires three operands. The
format of the conditional operator is:
Conditional_ expression ? expression1 : expression2;
If the value of conditional expression is true then the expression1 is evaluated,
otherwise expression2 is evaluated.
int a = 5, b = 6;
big = (a > b) ? a : b;
The condition evaluates to false, therefore big gets the value from b and it becomes 6.
The comma operator
The comma operator gives left to right evaluation of expressions. When the set of
expressions has to be evaluated for a value, only the rightmost expression is
considered.
int a = 1, b = 2, c = 3, i; // comma acts as separator, not as an operator
i = (a, b); // stores b into i
This first evaluates a (its value is discarded) and then evaluates b and stores it in i. So, at
the end, variable i contains the value 2.
The sizeof operator determines the amount of memory required for an object at compile
time rather than at run time.
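A sketch showing sizeof applied to a few types, plus integer division and the modulus
operator (the exact sizes printed depend on the compiler and platform):

#include <iostream>
using namespace std;

int main()
{
    cout << "sizeof(char)   = " << sizeof(char)   << " byte(s)" << endl;
    cout << "sizeof(int)    = " << sizeof(int)    << " byte(s)" << endl;
    cout << "sizeof(float)  = " << sizeof(float)  << " byte(s)" << endl;
    cout << "sizeof(double) = " << sizeof(double) << " byte(s)" << endl;

    int a = 7, b = 3;
    cout << "7 / 3 = " << a / b << ", 7 % 3 = " << a % b << endl;  // integer division and remainder
    return 0;
}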
1. while loop
2. for loop
3. do-while loop
while loop
The while loop can be described as an entry-controlled loop. It is completed in 3 steps.
Syntax:
variable(initialization);
while(condition)
{
statements;
variable increment or decrement;
}
for loop
The for loop is used to execute a set of statements repeatedly until a particular condition is
satisfied; we can call it an open-ended loop. Its general format is,
Syntax:
for(initialization; condition; increment/decrement)
{
statement;
}
do while loop
In some situations it is necessary to execute body of the loop before testing the
condition. Such situations can be handled with the help of do-while loop. do statement
evaluates the body of the loop first and at the end, the condition is checked
using while statement. General format of do-while loop is,
Syntax:
do{
.
}
while(condition);
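The three loops side by side in one small sketch; each prints the numbers 1 to 3:

#include <iostream>
using namespace std;

int main()
{
    int i = 1;                 // initialization
    while (i <= 3)             // condition tested before each pass
    {
        cout << "while: " << i << endl;
        i++;                   // increment
    }

    for (int j = 1; j <= 3; j++)   // initialization; condition; increment
        cout << "for: " << j << endl;

    int k = 1;
    do
    {
        cout << "do-while: " << k << endl;
        k++;
    } while (k <= 3);          // condition tested after the body has run at least once
    return 0;
}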
C language allows jumping from one statement to another within a loop as well as
jumping out of the loop.
1) break statement
When break statement is encountered inside a loop, the loop is immediately exited
and the program continues with the statement immediately following the loop.
2) continue statement
It causes the control to go directly to the test condition and then continue the loop
process. On encountering continue, control leaves the current cycle of the loop and
starts with the next cycle.
Storage class of a variable defines the lifetime and visibility of a variable. Lifetime
means the duration till which the variable remains active and visibility defines in which
module of the program the variable is accessible. There are five types of storage
classes in C++. They are:
1. Automatic
2. External
3. Static
4. Register
5. Mutable
auto: This is the default storage class for all the variables declared inside a function or
a block. Hence, the keyword auto is rarely used while writing programs in C language.
Auto variables can be only accessed within the block/function they have been declared
and not outside them (which defines their scope). Of course, these can be accessed
within nested blocks within the parent block/function in which the auto variable was
declared. However, they can be accessed outside their scope as well using the
concept of pointers given here by pointing to the very exact memory location where the
variables resides. They are assigned a garbage value by default whenever they are
declared.
extern: Extern storage class simply tells us that the variable is defined elsewhere and
not within the same block where it is used. Basically, the value is assigned to it in a
different block and this can be overwritten/changed in a different block as well. So an
extern variable is nothing but a global variable initialized with a legal value where it is
declared in order to be used elsewhere. It can be accessed within any function/block.
Also, a normal global variable can be made extern as well by placing the extern
keyword before its declaration/definition in any function/block. This basically signifies
that we are not initializing a new variable but instead we are using/accessing the global
variable only. The main purpose of using extern variables is that they can be accessed
between two different files which are part of a large program.
static: This storage class is used to declare static variables which are popularly used
while writing programs in C language. Static variables have a property of preserving
their value even after they are out of their scope! Hence, static variables preserve the
value of their last use in their scope. So we can say that they are initialized only once
and exist till the termination of the program. Thus, no new memory is allocated
because they are not re-declared. Their scope is local to the function to which they
were defined. Global static variables can be accessed anywhere in the program. By
default, they are assigned the value 0 by the compiler.
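A sketch of the static storage class: the variable calls keeps its value between invocations
of counter(), while the automatic variable temp is re-created every time (the function name
is illustrative):

#include <iostream>
using namespace std;

void counter()
{
    static int calls = 0;   // initialized once, preserved across calls
    int temp = 0;           // automatic variable: re-created on every call
    calls++;
    temp++;
    cout << "calls = " << calls << ", temp = " << temp << endl;
}

int main()
{
    counter();   // prints calls = 1, temp = 1
    counter();   // prints calls = 2, temp = 1
    counter();   // prints calls = 3, temp = 1
    return 0;
}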
register: This storage class declares register variables which have the same
functionality as that of the auto variables. The only difference is that the compiler tries
to store these variables in the register of the microprocessor if a free register is
available. This makes the use of register variables to be much faster than that of the
variables stored in the memory during the runtime of the program. If a free register is
not available, these are then stored in the memory only. Usually few variables which
are to be accessed very frequently in a program are declared with the register keyword
which improves the running time of the program. An important and interesting point to
be noted here is that we cannot obtain the address of a register variable using
pointers.
mutable: The mutable specifier applies only to class objects, which are discussed
later in this tutorial. It allows a member of an object to override constness. That is, a
mutable member can be modified by a const member function.
Function:
A function is a block of code that performs some operation. A function can optionally
define input parameters that enable callers to pass arguments into the function. A
function can optionally return a value as output. Functions are useful for encapsulating
common operations in a single reusable block, ideally with a name that clearly
describes what the function does. Every C++ program has at least one function, which
is main(), and all but the most trivial programs define additional functions.
1. Library Function: a function that is already defined in the standard library and can
simply be called from the program.
2. User-defined Function: C++ allows programmers to define their own functions. A user-
defined function groups code to perform a specific task and that group of code is given
a name(identifier).When the function is invoked from any part of program, it all
executes the codes defined in the body of function.
Syntax:
return-type function-name(parameters)
{
//function body;
}
return-type : suggests what the function will return. It can be int, char, some
pointer or even a class object. There can be functions which do not return
anything; they are declared with void.
Function Name : is the name of the function; the function is called using this name.
Parameters : are variables to hold values of arguments passed while function is
called. A function may or may not contain parameter list.
Function body : is the part where the code statements are written.
Functions are called by their names. If the function is without argument, it can be
called directly using its name. But for functions with arguments, we have two ways to
call them,
1.Call by Value :In this calling technique we pass the values of arguments which are
stored or copied into the formal parameters of functions. Hence, the original values are
unchanged only the parameters inside function changes.
2. Call by Reference: In this we pass the address of the variable as arguments. In this
case the formal parameter can be taken as a reference or a pointer, in both the case
they will change the values of the original variable.
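A small sketch contrasting the two calling techniques (the function names byValue and
byReference are only illustrative):

#include <iostream>
using namespace std;

void byValue(int n)       // n is a copy; the caller's variable is unchanged
{
    n = n + 10;
}

void byReference(int &n)  // n refers to the caller's variable; changes are visible outside
{
    n = n + 10;
}

int main()
{
    int x = 5;
    byValue(x);
    cout << "after call by value: x = " << x << endl;       // still 5
    byReference(x);
    cout << "after call by reference: x = " << x << endl;   // now 15
    return 0;
}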
Default Value of Parameters: When you define a function, you can specify a default
value for each of the last parameters. This value will be used if the corresponding
argument is left blank when calling to the function. This is done by using the
assignment operator and assigning values for the arguments in the function definition.
If a value for that parameter is not passed when the function is called, the default
given value is used, but if a value is specified, this default value is ignored and the
passed value is used instead.
When a function is called within the same function, it is known as recursion in C++. The
function which calls itself is known as a recursive function.
A function that calls itself, and doesn't perform any task after function call, is known as
tail recursion. In tail recursion, we generally call the same function with return
statement.
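A classic sketch of recursion: factorial() calls itself with a smaller value until it reaches the
base case:

#include <iostream>
using namespace std;

// factorial(n) = n * factorial(n - 1), with factorial(0) = factorial(1) = 1 as the base case
long factorial(int n)
{
    if (n <= 1)
        return 1;                   // base case stops the recursion
    return n * factorial(n - 1);    // recursive call on a smaller problem
}

int main()
{
    cout << "5! = " << factorial(5) << endl;   // prints 120
    return 0;
}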
Arrays: C++ provides a data structure, the array, which stores a fixed-size sequential
collection of elements of the same type. An array is used to store a collection of data,
but it is often more useful to think of an array as a collection of variables of the same
type. Instead of declaring individual variables, such as number0, number1, ..., and
number99, you declare one array variable such as numbers and use numbers[0],
numbers[1], and ..., numbers[99] to represent individual variables. A specific element
in an array is accessed by an index. All arrays consist of contiguous memory
locations. The lowest address corresponds to the first element and the highest
address to the last element.
Declaring Arrays
To declare an array in C++, the programmer specifies the type of the elements and
the number of elements required by an array as follows:
type arrayName[arraySize];
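A sketch of declaring, initializing and indexing an array (the name numbers and its size are
illustrative):

#include <iostream>
using namespace std;

int main()
{
    int numbers[5] = {10, 20, 30, 40, 50};   // type arrayName[arraySize] with an initializer list

    numbers[2] = 35;                          // indexes run from 0 to arraySize - 1

    for (int i = 0; i < 5; i++)               // visit every element in order
        cout << "numbers[" << i << "] = " << numbers[i] << endl;
    return 0;
}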
Strings in C++
String is a collection of characters. There are two types of strings commonly used in
C++ programming language:
Strings that are objects of string class (The Standard C++ Library string class)
C-strings (C-style Strings)
C-strings
In C programming, the collection of characters is stored in the form of arrays, this is
also supported in C++ programming. Hence it's called C-strings.
C-strings are arrays of type char terminated with null character, that is, \0 (ASCII value
of null character is 0).
String Object:
In C++, you can also create a string object for holding strings.
Unlike char arrays, string objects have no fixed length, and can be extended as
per your requirement.
1 strcpy(s1, s2);
Copies string s2 into string s1.
2 strcat(s1, s2);
Concatenates string s2 onto the end of string s1.
3 strlen(s1);
Returns the length of string s1.
4 strcmp(s1, s2);
Returns 0 if s1 and s2 are the same; less than 0 if s1<s2; greater than 0 if
s1>s2.
5 strchr(s1, ch);
Returns a pointer to the first occurrence of character ch in string s1.
6 strstr(s1, s2);
Returns a pointer to the first occurrence of string s2 in string s1.
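A sketch using a few of the C-string functions listed above (they are declared in the
<cstring> header):

#include <iostream>
#include <cstring>    // strcpy, strcat, strlen, strcmp, strchr, strstr
using namespace std;

int main()
{
    char s1[50] = "Hello";
    char s2[]   = ", World";

    strcat(s1, s2);                            // s1 becomes "Hello, World"
    cout << s1 << " (length " << strlen(s1) << ")" << endl;

    char s3[50];
    strcpy(s3, s1);                            // copy s1 into s3
    cout << "strcmp(s1, s3) = " << strcmp(s1, s3) << endl;   // 0 means the strings are equal
    return 0;
}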
Pointer in C++:
A pointer in the C++ language is a variable, also known as a locator or indicator, that
holds the address of a value.
Advantage of pointer
1) Pointer reduces the code and improves the performance, it is used to retrieving
strings, trees etc. and used with arrays, structures and functions.
2) We can return multiple values from a function using pointers.
3) It makes you able to access any memory location in the computer's memory.
Usage of pointer
Pointers in the C/C++ language are widely used in arrays, functions and structures. They
reduce the code and improve the performance.
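A small pointer sketch: p stores the address of the variable value, and *p reads or changes
whatever is stored at that address (the names are illustrative):

#include <iostream>
using namespace std;

int main()
{
    int value = 42;
    int *p = &value;            // p holds the address of value

    cout << "value   = " << value << endl;
    cout << "address = " << p << endl;       // the memory location
    cout << "*p      = " << *p << endl;      // the data stored at that location

    *p = 100;                   // writing through the pointer changes value itself
    cout << "value is now " << value << endl;
    return 0;
}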
OOPs:
Object means a real world entity such as pen, chair, table etc. Object-Oriented
Programming is a methodology or paradigm to design a program using classes and
objects. It simplifies the software development and maintenance by providing some
concepts:
o Object
o Class
o Inheritance
o Polymorphism
o Abstraction
o Encapsulation
Object: Any entity that has state and behavior is known as an object. For
example: chair, pen, table, keyboard, bike etc. It can be physical and logical.
Inheritance: When one object acquires all the properties and behaviours of a parent
object, it is known as inheritance. It provides code reusability. It is used to achieve
runtime polymorphism.
Encapsulation: Binding (or wrapping) code and data together into a single unit is
known as encapsulation. For example: capsule, it is wrapped with different
medicines.
Classes:
1. A class name should start with an uppercase letter (although this is not mandatory). If
the class name is made of more than one word, then the first letter of each word should
be uppercase. Example,
class StudyRegular
2. Classes contain, data members and member functions, and the access of these
data members and variable depends on the access specifiers (discussed in next
section).
3. Class's member functions can be defined inside the class definition or outside the
class definition.
4. Classes in C++ are similar to structures in C, the only difference being that a class
defaults to private access control, whereas a structure defaults to public.
5. All the features of OOPS, revolve around classes in C++. Inheritance,
Encapsulation, Abstraction etc.
6. Objects of a class hold separate copies of the data members. We can create as many
objects of a class as we need.
7. Classes do possess more characteristics, like we can create abstract classes,
immutable classes, all this we will study later.
Objects
A class is merely a blueprint or a template. No storage is assigned when we define a
class. Objects are instances of a class; they hold the data variables declared in the class,
and the member functions work on these class objects. Each object has different data
variables. Objects are initialised using special class functions called Constructors. We
will study about constructors later. And whenever the object is out of its scope, another
special class member function called Destructor is called, to release the memory
reserved by the object. C++ doesn't have Automatic Garbage Collector like in JAVA, in
C++ Destructor performs this task.
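A sketch of a class acting as a blueprint and two objects holding separate copies of the
data members (the class name Student and its members are only illustrative):

#include <iostream>
#include <string>
using namespace std;

class Student
{
private:                    // data members are private by default in a class
    string name;
    int marks;

public:
    void setData(string n, int m)   // member functions work on the object they are called on
    {
        name = n;
        marks = m;
    }
    void show()
    {
        cout << name << " scored " << marks << endl;
    }
};

int main()
{
    Student a, b;                  // two objects, each with its own name and marks
    a.setData("Asha", 91);
    b.setData("Ravi", 78);
    a.show();
    b.show();
    return 0;
}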
Member functions are part of C++ classes. Member functions represent behavior
of a class. All member functions can be divided into the following categories:
1. Simple functions: Simple functions are functions that do not have any
specific keyword used in declaration. They do not have any special behavior
and they manipulate the data members of a class. The syntax used for the
declaration of a simple member function is:
return_type FunctionName(parameter_list);
3. Static functions: Static functions have class scope. They can't modify any
non-static data members or call non-static member functions. Static member
functions do not have an implicit this argument. That's why they can work only
with static members of the class. Static member functions can be declared using
the following format:
static return_type FunctionName(parameter_list);
4. Inline functions: Inline functions are declared by using inline keyword. The
purpose of inline functions is discussed in detail in the Inline functions section. All the
functions that are implemented inside the class declaration are inline member
functions.
Although friend functions are not member functions, we will discuss the use of friend
functions too. Friend functions can access even private members of a class. A friend
function is a function that is not a member function of a class, but it has access to the
private and protected members of that class. A friend function is declared and
implemented outside of class as a simple function. But the class has to grant friend
privileges by declaring this function with friend keyword inside of the class
declaration.
Function overloading :
A feature in C++ that enables several functions with the same name to be defined with
different types of parameters or a different number of parameters. This feature is called
function overloading. The appropriate function is identified by the compiler by
examining the number or the types of parameters / arguments in the overloaded
functions. Function overloading avoids having to invent different function names when
more than one function performs similar functionality.
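A sketch of function overloading: three functions share the name area() and the compiler
picks one by looking at the arguments (the function name is illustrative):

#include <iostream>
using namespace std;

int area(int side)                 // square
{
    return side * side;
}

int area(int length, int breadth)  // rectangle: same name, different number of parameters
{
    return length * breadth;
}

double area(double radius)         // circle: same name, different parameter type
{
    return 3.14159 * radius * radius;
}

int main()
{
    cout << area(4) << endl;        // calls area(int)
    cout << area(4, 6) << endl;     // calls area(int, int)
    cout << area(2.5) << endl;      // calls area(double)
    return 0;
}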
Operator overloading :
A feature in C++ that enables the redefinition of operators. This feature operates on
user-defined objects. All overloaded operators provide syntactic sugar for function
calls that are equivalent. Without changing the fundamentals of the language,
operator overloading provides a pleasant facade.
Constructor
It is a member function having same name as its class and which is used to initialize
the objects of that class type with a legal initial value. A constructor is automatically
called when object is created.
Types of Constructor
Default Constructor-: A constructor that accepts no parameters is known as default
constructor. If no constructor is defined then the compiler supplies a default
constructor.
Circle :: Circle()
{
radius = 0;
}
Parameterized Constructor -: A constructor that receives arguments/parameters, is
called parameterized constructor.
Circle :: Circle(double r)
{
radius = r;
}
Copy Constructor-: A constructor that initializes an object using values of another
object passed to it as parameter, is called copy constructor. It creates the copy of the
passed object.
Circle :: Circle(Circle &t)
{
radius = t.radius;
}
There can be multiple constructors of the same class, provided they have different
signatures.
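Putting the three constructors above into one runnable sketch of the Circle class:

#include <iostream>
using namespace std;

class Circle
{
    double radius;
public:
    Circle()                { radius = 0; }          // default constructor
    Circle(double r)        { radius = r; }          // parameterized constructor
    Circle(const Circle &t) { radius = t.radius; }   // copy constructor
    double getRadius()      { return radius; }
};

int main()
{
    Circle a;            // default constructor, radius = 0
    Circle b(2.5);       // parameterized constructor
    Circle c(b);         // copy constructor, copies b's radius
    cout << a.getRadius() << " " << b.getRadius() << " " << c.getRadius() << endl;
    return 0;
}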
Destructor
A destructor is a member function having the same name as that of its class preceded by
a ~ (tilde) sign, and it is used to destroy the objects that have been created by a
constructor. It gets invoked when an object's scope is over.
~Circle() {}
Both of the functions have the same name as that of the class, destructor function
having (~) before its name.
Both constructor and destructor functions should not be preceded by any data type
(not even void).
These functions do not (and cannot) return any values.
We can have only the constructor function in a class without destructor function or
vice-versa.
Constructor function can take arguments but destructors cannot.
Constructor function can be overloaded as usual functions.
Explicit call to the constructor: - An explicit call to the constructor means that the
constructor is explicitly declared by the programmer inside the class.
Implicit call to the constructor: - An implicit call to the constructor means that the
constructor is implicitly provided by the compiler when an object of the class is
created and there is no explicit constructor defined inside the class.
C++ Enumeration:
Enum in C++ is a data type that contains fixed set of constants. It can be used for days
of the week (SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY
and SATURDAY), directions (NORTH, SOUTH, EAST and WEST) etc. The C++
enum constants are compile-time constants that cannot be changed at run time. C++
enums can be thought of as types that have a fixed set of named constants.
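A sketch of a C++ enum for directions (the enumerator values default to 0, 1, 2, 3):

#include <iostream>
using namespace std;

enum Direction { NORTH, SOUTH, EAST, WEST };   // NORTH = 0, SOUTH = 1, EAST = 2, WEST = 3

int main()
{
    Direction d = EAST;
    if (d == EAST)
        cout << "heading east (value " << d << ")" << endl;
    return 0;
}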
Purpose of Inheritance
1. Code Reusability
2. Method Overriding (Hence, Runtime Polymorphism.)
3. Use of Virtual Keyword
1. Single Inheritance
2. Multiple Inheritance
3. Hierarchical Inheritance
4. Multilevel Inheritance
5. Hybrid Inheritance (also known as Virtual Inheritance)
Single Inheritance
In this type of inheritance one derived class inherits from only one base class. It is the
simplest form of inheritance.
Multiple Inheritance
In this type of inheritance a single derived class may inherit from two or more than two
base classes.
Hierarchical Inheritance
In this type of inheritance, more than one derived class inherits from a single base class.
Multilevel Inheritance
In this type of inheritance the derived class inherits from a class, which in turn inherits
from some other class. The Super class for one, is sub class for the other.
Hybrid (Virtual) Inheritance
Hybrid Inheritance is a combination of Hierarchical and Multilevel Inheritance.
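A sketch of single inheritance: Car derives from Vehicle and reuses its member function
(the class names are only illustrative):

#include <iostream>
using namespace std;

class Vehicle                      // base class
{
public:
    void start() { cout << "Vehicle started" << endl; }
};

class Car : public Vehicle         // derived class: inherits start() from Vehicle
{
public:
    void honk() { cout << "Car horn!" << endl; }
};

int main()
{
    Car c;
    c.start();   // inherited from the base class (code reusability)
    c.honk();    // defined in the derived class
    return 0;
}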
Upcasting and downcasting are an important part of C++. Upcasting and downcasting
gives a possibility to build complicated programs with a simple syntax. It can be
achieved by using Polymorphism. C++ allows that a derived class pointer (or
reference) to be treated as base class pointer. This is upcasting. Downcasting is an
opposite process, which consists in converting base class pointer (or reference) to
derived class pointer.
In C++ programming you can achieve compile-time polymorphism in two ways, which
are given below;
Method overloading
Method overriding
Whenever the same method name exists multiple times in the same class with a
different number of parameters, a different order of parameters or different types of
parameters, it is known as method overloading.
Defining a method in both the base class and the derived class with the same name and
the same parameters or signature is known as method overriding.
In C++, run-time polymorphism can be achieved by using virtual functions. A virtual
function is a function in the base class which is overridden in the derived class, and
which tells the compiler to perform late binding on this function.
Virtual Keyword is used to make a member function of the base class Virtual.
1. Only the Base class Method's declaration needs the Virtual Keyword, not the
definition.
2. If a function is declared as virtual in the base class, it will be virtual in all its derived
classes.
3. The address of the virtual function is placed in the VTABLE and the compiler
uses the VPTR (vpointer) to point to the virtual function.
1. An abstract class cannot be instantiated, but pointers and references of abstract class
type can be created.
2. Abstract class can have normal functions and variables along with a pure virtual
function.
3. Abstract classes are mainly used for Upcasting, so that its derived classes can use
its interface.
4. Classes inheriting an Abstract Class must implement all pure virtual functions, or
else they will become Abstract too.
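A sketch tying these ideas together: Shape is an abstract class with a pure virtual function,
Circle and Square override it, and a base-class pointer (upcasting) picks the right version at
run time (all class names are illustrative):

#include <iostream>
using namespace std;

class Shape                                // abstract class: cannot be instantiated
{
public:
    virtual void draw() = 0;               // pure virtual function
    virtual ~Shape() {}                    // virtual destructor for safe deletion through a base pointer
};

class Circle : public Shape
{
public:
    void draw() { cout << "drawing a circle" << endl; }   // overrides the pure virtual function
};

class Square : public Shape
{
public:
    void draw() { cout << "drawing a square" << endl; }
};

int main()
{
    Shape *s = new Circle();   // upcasting: derived object through a base-class pointer
    s->draw();                 // late binding calls Circle::draw()
    delete s;

    s = new Square();
    s->draw();                 // now calls Square::draw()
    delete s;
    return 0;
}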
Exception in C++: Exception is an event that happens when unexpected circumstances appear. It
can be a runtime error or you can create an exceptional situation programmatically. Exception
handling consists in transferring control from the place where exception happened to the special
functions (commands) called handlers. Exceptions are handled by using try/catch block. The code
that can produce an exception is surrounded with try block. The handler for this exception is placed
in catch block.
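A sketch of try/catch: the code that may fail sits in the try block, and control jumps to the
matching catch handler when an exception is thrown (the divide() function is illustrative):

#include <iostream>
#include <stdexcept>
using namespace std;

double divide(double a, double b)
{
    if (b == 0)
        throw runtime_error("division by zero");   // creates the exceptional situation
    return a / b;
}

int main()
{
    try
    {
        cout << divide(10, 2) << endl;   // fine, prints 5
        cout << divide(10, 0) << endl;   // throws; control jumps to the catch block
    }
    catch (const runtime_error &e)       // handler receives control
    {
        cout << "caught exception: " << e.what() << endl;
    }
    return 0;
}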
Java team members (also known as Green Team), initiated a revolutionary task to
develop a language for digital devices such as set-top boxes, televisions etc.
For the Green Team members, it was an advanced concept at that time. But it was best
suited for internet programming. Later, Java technology was incorporated by Netscape.
Features of Java:
1) Simple
Java is easy to learn and its syntax is quite simple, clean and easy to understand. The
confusing and ambiguous concepts of C++ are either left out in Java or they have been
re-implemented in a cleaner way.
Eg : Pointers and Operator Overloading are not there in java but were an important
part of C++.
2) Object Oriented
In java everything is Object which has some data and behaviour. Java can be easily
extended as it is based on Object Model.
3) Robust
Java makes an effort to eliminate error prone codes by emphasizing mainly on compile
time error checking and runtime checking. But the main areas which Java improved
were Memory Management and mishandled Exceptions by introducing
automatic Garbage Collector and Exception Handling.
4) Platform Independent
Unlike other programming languages such as C and C++, which are compiled into
platform-specific machine code, Java is guaranteed to be a write-once, run-anywhere
language.
On compilation Java program is compiled into bytecode. This bytecode is platform
independent and can be run on any machine, plus this bytecode format also provide
security. Any machine with Java Runtime Environment can run Java Programs.
5) Secure
When it comes to security, Java is always the first choice. Java's security features
enable us to develop virus-free, tamper-free systems. A Java program always runs in the
Java runtime environment with almost no interaction with the system OS, hence it is more
secure.
6) Multi Threading
Java's multithreading feature makes it possible to write programs that can do many tasks
simultaneously. The benefit of multithreading is that it utilizes the same memory and other
resources to execute multiple threads at the same time; for example, while typing,
grammatical errors can be checked in parallel.
7) Architectural Neutral
8) Portable
Java bytecode can be carried to any platform. There are no implementation-dependent
features; everything related to storage is predefined, for example the size of primitive data types.
9) High Performance
Java Application:
There are mainly 4 type of applications that can be created using java:-
1) Standalone Application/Desktop Application
2) Web Application
3) Enterprise Application
4) Mobile Application
1) Standalone Application/Desktop Application :-
It is also known as desktop application or window-based application. An application
that we need to install on every machine such as media player, antivirus etc. AWT and
Swing are used in java for creating standalone applications.
2) Web Application :-
An application that runs on the server side and creates dynamic page, is called web
application. Currently, servlet, jsp, struts, jsf etc. technologies are used for creating
web applications in java.
3) Enterprise Application :-
An application that is distributed in nature, such as banking applications etc. It has the
advantage of high level security, load balancing and clustering. In java, EJB is used for
creating enterprise applications.
4) Mobile Application :-
An application that is created for mobile devices. Currently Android and Java ME are
used for creating mobile applications.
i) Instance Variables (Non-Static Fields):- Non-static fields are also known as
instance variables because their values are unique
to each instance of a class (to each object, in other words); the currentSpeed of one
bicycle is independent from the currentSpeed of another.
ii) Class Variables (Static Fields):- A class variable is any field declared with the
static modifier; this tells the compiler that there is exactly one copy of this variable in
existence, regardless of how many times the class has been instantiated.
A field defining the number of gears for a particular kind of bicycle could be marked as
static since conceptually the same number of gears will apply to all instances. The
code static int numGears = 6; would create such a static field. Additionally, the
keyword final could be added to indicate that the number of gears will never change.
iii) Local Variables:- Similar to how an object stores its state in fields, a method will
often store its temporary state in local variables. The syntax for declaring a local
variable is similar to declaring a field (for example, int count = 0;).
There is no special keyword designating a variable as local; that determination comes
entirely from the location in which the variable is declared which is between the
opening and closing braces of a method.
As such, local variables are only visible to the methods in which they are declared;
they are not accessible from the rest of the class.
JVM:
Java virtual Machine(JVM) is a virtual Machine that provides runtime environment to
execute java byte code. The JVM doesn't understand Java source code; that's why you
compile your *.java files to obtain *.class files that contain the bytecodes
understandable by the JVM. The JVM controls the execution of every Java program. It enables
features such as automated exception handling, Garbage-collected heap.
JRE : The Java Runtime Environment (JRE) provides the libraries, the Java Virtual
Machine, and other components to run applets and applications written in the Java
programming language. JRE does not contain tools and utilities such as compilers or
debuggers for developing applets and applications.
JDK : The JDK also called Java Development Kit is a superset of the JRE, and
contains everything that is in the JRE, plus tools such as the compilers and debuggers
necessary for developing applets and applications.
JIT: It is a set of programs developed by Sun Microsystems and added as a part of the
JVM to speed up the interpretation phase.
In older versions of Java the compilation phase was much faster than the interpretation
phase, and the industry complained to Sun Microsystems that compilation was very fast
while interpretation was very slow.
To solve this issue, Sun Microsystems developed a program called JIT (just-in-time
compiler) and added it as a part of the JVM to speed up the interpretation phase. In the
current version of Java the interpretation phase is faster than the compilation phase. Hence
Java is one of the highly interpreted programming languages.
Class Loader : Class loader loads the Class for execution.
Method area : Stores per-class structures such as the constant pool.
Heap : The heap is the runtime data area in which objects are allocated.
Stack : Local variables and partial results are stored here. Each thread has a private JVM stack,
created when the thread is created.
Program register : The program register holds the address of the JVM instruction currently being
executed.
Native method stack : It contains all the native methods used in the application.
Execution Engine : The execution engine controls the execution of instructions contained in the
methods of the classes.
Native Method Interface : Native method interface gives an interface between java code and
native code during execution.
Native Method Libraries : Native Libraries consist of files required for the execution of native
code.
/jdk1.5.0/jre/bin Executable files for tools and libraries used by the Java
platform. The executable files are identical to files in
/jdk1.5.0/bin. The java launcher tool serves as an
application launcher, in place of the old jre tool that
shipped with 1.1 versions of the JDK software. This
directory does not need to be in the PATH environment
variable.
/jdk1.5.0/jre/lib Code libraries, property settings, and resource files used
by the Java runtime environment.
/jdk1.5.0/jre/lib/i386/client Contains the .so file used by the Java HotSpot Client
Virtual Machine, which is implemented with Java
HotSpot technology. This is the default VM.
/jdk1.5.0/jre/lib/i386/server Contains the .so file used by the Java HotSpot Server
Virtual Machine.
java
The launcher for Java applications.
javadoc
API documentation generator.
appletviewer
Run and debug applets without a web browser.
jar
Create and manage Java Archive (JAR) files.
jdb
The Java Debugger.
javah
C header and stub generator. Used to write native
methods.
javap
Class file disassembler
extcheck
Utility to detect Jar conflicts.
The Java path setting is required for using tools such as javac, java etc. If you are saving
the java file in the jdk/bin folder, setting the path is not required. But if your java file is
outside the jdk/bin folder, it is necessary to set the path of the JDK.
There are two ways to set the path of the JDK:
1. Temporary
2. Permanent
this keyword
Garbage Collection
In Java destruction of object from memory is done automatically by the JVM. When
there is no reference to an object, then that object is assumed to be no longer needed
and the memory occupied by the object are released. This technique is
called Garbage Collection. This is accomplished by the JVM.
Unlike C++ there is no explicit need to destroy object.
Can the Garbage Collection be forced explicitly ?
No, the Garbage Collection can not be forced explicitly. We may request JVM
for garbage collection by calling the System.gc() method, but this does not guarantee
that the JVM will perform the garbage collection.
Advantages of Garbage Collection:
finalize() method
Sometime an object will need to perform some specific task before it is destroyed such
as closing an open connection or releasing any resources held. To handle such
situation finalize() method is used. finalize()method is called by garbage collection
thread before collecting the object. It's the last chance for any object to perform cleanup
utility.
gc() Method
The gc() method is used to call the garbage collector explicitly. However, the gc() method
does not guarantee that the JVM will perform the garbage collection. It only requests the
JVM for garbage collection. This method is present in the System and Runtime classes.
Object Cloning:
The object cloning is a way to create exact copy of an object. For this purpose,
clone() method of Object class is used to clone an object. The java.lang.Cloneable
interface must be implemented by the class whose object clone we want to create. If
we don't implement Cloneable interface, clone() method
generates CloneNotSupportedException.
The clone() method saves the extra processing task of creating an exact copy of an
object. If we performed it by using the new keyword, a lot of processing would need to
be done; that is why we use object cloning.
Wrapper class in java provides the mechanism to convert primitive into object and
object into primitive.
Since J2SE 5.0, autoboxing and unboxing feature converts primitive into object and
object into primitive automatically. The automatic conversion of primitive into object is
known as autoboxing and vice-versa unboxing.
The eight classes of the java.lang package are known as wrapper classes in Java. The
eight wrapper classes are Boolean, Character, Byte, Short, Integer, Long, Float and Double.
Instanceof Operator:
The java instanceof operator is used to test whether the object is an instance of the
specified type (class or subclass or interface). The instanceof in java is also known as
type comparison operator because it compares the instance with type. It returns either
true or false. If we apply the instanceof operator with any variable that has null value, it
returns false.
Java Package
Package are used in Java, in-order to avoid name conflicts and to control access of
class, interface and enumeration etc. A package can be defined as a group of similar
types of classes, interface, enumeration and sub-package. Using package it becomes
easier to locate the related classes.
Packages are categorized into two forms: built-in packages and user-defined packages.
What is Abstraction
Abstraction is process of hiding the implementation details and showing only the
functionality.
Abstraction in Java is achieved by using interfaces and abstract classes. An interface
gives 100% abstraction and an abstract class gives 0-100% abstraction.
Syntax:
abstract class <class-name>{}
An abstract class is something which is incomplete and you cannot create instance of
abstract class.
If you want to use it you need to make it complete or concrete by extending it.
A class is called concrete if it does not contain any abstract method and
implements all abstract methods inherited from the abstract class or interface it has
implemented or extended.
A method that is declared as abstract and does not have an implementation is known as
an abstract method.
If you define an abstract method then the class must be abstract.
Syntax:
abstract return-type <method-name>(parameters);
- Encapsulated Code is more flexible and easy to change with new requirements.
- By providing only getter and setter method access, you can make the class read
only.
- Encapsulation in Java makes unit testing easy.
- A class can have total control over what is stored in its fields. Suppose you want to
ensure that the value of the marks field is always positive; then you can write the
logic for positive values in the setter method.
- Encapsulation also helps to write immutable class in Java which are a good choice in
multi-threading environments.
- Encapsulation allows you to change one part of code without affecting other part of
code.
Java String:
The following is the class signature of the String class defined in the java.lang package:
public final class String implements java.io.Serializable, Comparable<String>, CharSequence
1. valueOf( parameter ) :
The valueOf() method is static and is overloaded many times in the String class. Its job is
to convert any primitive data type or object, passed as a parameter, into string form.
It functions similarly to the toString() method of the Object class, but toString() converts
only objects into string form.
2. length( ) :
length( ) is an instance method in String class which returns an int value. It must be
called with an instance of String class and returns the number of characters present in
the string instance.
3.equals( ) :
equals( ) method is inherited from Object class and is overridden in String class. It
returns a boolean value of true if the strings are same or false, if the strings are
different. In the comparison, case( upper or lower) of the letters is considered.
Important Note:
String is a final class; i.e. once created, the value cannot be altered. Thus
String objects are called immutable.
The Java Virtual Machine (JVM) creates a memory area especially for
Strings called the String Constant Pool. That's why a String can be initialized
without the new keyword.
The String class belongs to the java.lang package, but there is no need to
import this class; the Java platform provides it automatically.
A String reference can be reassigned, but that does not delete the content.
Multiple references can be used for the same String, and they all refer to the
same object in the pool.
A string that can be modified or changed is known as mutable string. StringBuffer and
StringBuilder classes are used for creating mutable string.
Java Multithreading:
Thread :-
A thread is a single sequential (separate) flow of control within a program. Sometimes it
is called an execution context or a lightweight process. A thread itself is not a program.
A thread cannot run on its own (as it is a part of a program); rather, it runs within a
program. A program can be divided into a number of packets of code, each
representing a thread having its own separate flow of control.
Light weight process: A thread is considered a light weight process because it runs
within the context of a program and takes advantage of the resources allocated to that
program.
Heavy weight process: In the heavy weight process, the control changes in between
threads belonging to different processes. ( In light weight process, the control changes
in between threads belonging to same(one) process ).
Execution context: A thread will have its own execution stack and program counter.
The code running within the thread works only within that context.
One of the strengths of Java is its support for multithreading. All the classes needed to
write a multithreaded program are included in the default imported package java.lang
through class Object, class Thread and interface Runnable.
Synchronization: At times when more than one thread try to access a shared
resource, we need to ensure that resource will be used by only one thread at a time.
The process by which this is achieved is called synchronization. The synchronization
keyword in java creates a block of code referred to as critical section.
Syntax:
synchronized(object)
{
//statement to be synchronized
}
Every Java object with a critical section of code gets a lock associated with the object.
To enter critical section a thread need to obtain the corresponding object's lock.
Why do we use synchronization?
If we do not use synchronization, and let two or more threads access a shared resource
at the same time, it will lead to distorted results.
Consider an example, Suppose we have two different threads T1 and T2, T1 starts
execution and save certain values in a file temporary.txt which will be used to calculate
some result when T1 returns. Meanwhile, T2 starts and before T1 returns, T2 change
the values saved by T1 in the file temporary.txt (temporary.txt is the shared resource).
Now obviously T1 will return wrong result.
To prevent such problems, synchronization was introduced. With synchronization in
above case, once T1 starts using temporary.txt file, this file will be locked(LOCK
mode), and no other thread will be able to access or modify it until T1 returns.
Using Synchronized Methods
Using synchronized methods is a way to accomplish synchronization. But let's first see
what happens when we do not use synchronization in our program.
In Java, the synchronized keyword has a performance cost. A synchronized method in
Java can be slow and degrade performance. So we should use the synchronized keyword
in Java only when it is necessary; otherwise, we should use a Java synchronized block,
which synchronizes the critical section only.
Interthread Communication
Java provides the benefit of avoiding thread polling through inter-thread communication.
The wait(), notify(), and notifyAll() methods of the Object class are used for this purpose.
These methods are implemented as final methods in Object, so all classes have
them. All three methods can be called only from within a synchronized context.
wait() tells calling thread to give up monitor and go to sleep until some other thread
enters the same monitor and call notify.
notify() wakes up a thread that called wait() on same object.
notifyAll() wakes up all the thread that called wait() on same object.
wait() vs sleep():
wait() gets awakened when the notify() or notifyAll() method is called; sleep() does not
get awakened when notify() or notifyAll() is called.
wait() is generally used on a condition; sleep() is simply used to put your thread to sleep.
Polling
Polling is usually implemented by a loop, i.e. checking some condition repeatedly. Once
the condition is true, the appropriate action is taken. This wastes CPU time.
i) Bytecode:
When we compile a .java file, we get a .class file. The .class file can run on any
operating system irrespective of platform on which it was compiled. For this reason,
Java is called platform independent. But the .exe file of C language is not platform
independent.
.exe file contains binary code. Java's .class file contains bytecode. This bytecode
makes Java cross platform. Java compiler produces bytecodes. Any JVM, can run
these bytecode and produce output.
ii) Unicode:
The ASCII (extended) character range is 0 to 255. We cannot add one more character
even if we want to. Only the English alphabet has corresponding ASCII values. That is
why we cannot write a C program in any language other than English.
Java's motto is internationalization. That is, it supports many world languages, like
Telugu, Kannada, Greek, Japanese etc. That is, there is a corresponding
Unicode value in Java for the characters of all these international languages.
This is possible due to the size of character of 2 bytes. That is, the character can
represent values ranging from 0 to 65,535. This range is called Unicode. We can say
ASCII is a subset of Unicode.
Upto 255, Unicode represents ASCII range and afterwards it adds its own values for
the alphabets of many world languages. Unicode is already includes up to 34,128
characters.
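A small sketch showing that a Java char is a 16-bit Unicode value (the particular code points are only examples):
public class UnicodeDemo {
    public static void main(String[] args) {
        char latinA = 'A';           // code point 65, same as ASCII
        char greekAlpha = '\u03B1';  // Greek small letter alpha
        char teluguA = '\u0C05';     // Telugu letter A
        System.out.println((int) latinA);               // 65
        System.out.println(greekAlpha + " " + teluguA);
        System.out.println((int) Character.MAX_VALUE);  // 65535, top of the char range
    }
}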
Type wrapper:
Java uses primitive data types such as int, double and float to hold basic values for the
sake of performance. Despite the performance benefits offered by primitive types, there are
situations where you need an object representation of a primitive value. For example, many
data structures in Java operate on objects, so you cannot use primitive types with them
directly. To handle such situations, Java provides type wrappers: classes that encapsulate a
primitive type within an object.
Autoboxing and Unboxing:
1. Autoboxing/unboxing lets us use primitive types and wrapper class objects
interchangeably.
2. We do not have to perform explicit typecasting.
3. It helps prevent errors, but may sometimes lead to unexpected results, so it must be
used with care.
4. Auto-unboxing also allows you to mix different types of numeric objects in an
expression: when the values are unboxed, the standard type conversions are applied
(see the sketch after this list).
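A short sketch of wrappers with autoboxing and unboxing (the list contents and values are illustrative):
import java.util.ArrayList;
import java.util.List;

public class BoxingDemo {
    public static void main(String[] args) {
        Integer boxed = 10;       // autoboxing: int -> Integer
        int unboxed = boxed;      // unboxing:   Integer -> int

        // Collections store objects, so primitives are boxed automatically.
        List<Integer> numbers = new ArrayList<>();
        numbers.add(5);                          // 5 is autoboxed
        int sum = numbers.get(0) + unboxed;      // unboxed for arithmetic
        System.out.println(sum);                 // 15

        // A caution from point 3: == compares references for wrapper objects,
        // so equals() should be used for value comparison.
        Integer a = 1000, b = 1000;
        System.out.println(a.equals(b));         // true (a == b may be false)
    }
}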
Java programs are of two types:
i) Applications and
ii) Applets.
Applications are programs that contain a main() method, while applets are programs that do
not contain a main() method. Applications can be executed with the Java interpreter from
the command line (with the java command); applets need a browser to execute.
Before JDBC, the ODBC API was the database API used to connect to and execute queries
against a database. But the ODBC API uses an ODBC driver written in the C language (i.e.
platform dependent and less secure). That is why Java defined its own API (the JDBC API),
which uses JDBC drivers written in Java.
A JDBC driver is a software component that enables a Java application to interact with
a database. There are 4 types of JDBC drivers.
The JDBC-ODBC bridge driver uses an ODBC driver to connect to the database: it converts
JDBC method calls into ODBC function calls. This approach is now discouraged in favour of
the thin (pure Java) driver, as in the sketch below.
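A hedged sketch of querying a database through a JDBC driver; the URL, credentials and table are placeholders, not values from these notes, and the matching JDBC driver jar must be on the classpath.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class JdbcDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details - replace with your own database.
        String url = "jdbc:mysql://localhost:3306/testdb";
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT id, name FROM employees")) {
            while (rs.next()) {
                System.out.println(rs.getInt("id") + " " + rs.getString("name"));
            }
        }   // try-with-resources closes the connection, statement and result set
    }
}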
Java Regex:
The Java Regex (regular expression) API is used to define patterns for searching or
manipulating strings, as in the sketch below.
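A small sketch using the java.util.regex classes (the pattern and input text are illustrative):
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexDemo {
    public static void main(String[] args) {
        // Pattern: three digits, a dash, then four digits.
        Pattern pattern = Pattern.compile("\\d{3}-\\d{4}");
        Matcher matcher = pattern.matcher("Call 555-1234 or 555-9876");
        while (matcher.find()) {
            System.out.println("Found: " + matcher.group());
        }
        System.out.println("abc123".matches("[a-z]+\\d+"));  // true
    }
}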
RMI:
RMI (Remote Method Invocation) is an API that provides a mechanism for creating
distributed applications in Java. RMI allows an object to invoke methods on an object
running in another JVM, and it provides remote communication between applications
using two objects: the stub (on the client) and the skeleton (on the server).
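A minimal sketch of only the remote-interface side of RMI (the interface and method names are illustrative); a complete application would also need a server class that exports this object and a client that looks it up in the RMI registry.
import java.rmi.Remote;
import java.rmi.RemoteException;

// Methods of a remote interface can be invoked from another JVM
// and must declare RemoteException.
public interface HelloService extends Remote {
    String sayHello(String name) throws RemoteException;
}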
Networking Ports:
Port Service name Transport protocol
20, 21 File Transfer Protocol (FTP) TCP
22 Secure Shell (SSH) TCP and UDP
23 Telnet TCP
25 Simple Mail Transfer Protocol (SMTP) TCP
50, 51 IPSec
53 Domain Name Server (DNS) TCP and UDP
67, 68 Dynamic Host Configuration Protocol (DHCP) UDP
69 Trivial File Transfer Protocol (TFTP) UDP
80 Hyper Text Transfer Protocol (HTTP) TCP
110 Post Office Protocol (POP3) TCP
119 Network News Transport Protocol (NNTP) TCP
123 Network Time Protocol (NTP) UDP
135-139 NetBIOS TCP and UDP
143 Internet Message Access Protocol (IMAP4) TCP and UDP
161, 162 Simple Network Management Protocol (SNMP) TCP and UDP
389 Lightweight Directory Access Protocol TCP and UDP
443 HTTP with Secure Sockets Layer (SSL) TCP and UDP
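A small sketch of how a port number from the table is used when opening a TCP connection in Java (the host name is a placeholder and the program needs network access):
import java.net.Socket;

public class PortDemo {
    public static void main(String[] args) throws Exception {
        // Port 80 is the well-known HTTP port listed above.
        try (Socket socket = new Socket("example.com", 80)) {
            System.out.println("Connected to " + socket.getInetAddress()
                    + " on port " + socket.getPort());
        }
    }
}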
Generation of Computers:
First Generation (vacuum tubes):
These early computers used vacuum tubes as circuitry and magnetic drums for
memory. As a result they were enormous, literally taking up entire rooms and costing a
fortune to run. The vacuum tubes were inefficient, consumed huge amounts of electricity
and generated a lot of heat, which caused ongoing breakdowns.
These first generation computers relied on machine language (the most basic
programming language, which can be understood directly by computers) and were limited
to solving one problem at a time. Input was based on punched cards and paper tape, and
output came out on printouts. The two notable machines of this era were the UNIVAC and
ENIAC; the UNIVAC was the first commercial computer, purchased in 1951 by a business
customer, the US Census Bureau.
Second Generation (transistors):
The replacement of vacuum tubes by transistors saw the advent of the second
generation of computing. Although first invented in 1947, transistors were not used
significantly in computers until the end of the 1950s. They were hugely superior to vacuum
tubes, making computers smaller, faster, cheaper and less power-hungry, although they still
subjected computers to damaging levels of heat. These machines still relied on punched
cards for input and printouts for output.
Third Generation (integrated circuits):
By this phase, transistors were being miniaturised and put on silicon chips (called
semiconductors), which led to a massive increase in the speed and efficiency of these
machines. These were the first computers where users interacted using keyboards and
monitors that interfaced with an operating system, a significant leap from punched cards
and printouts. This enabled the machines to run several applications at once, with a central
program monitoring memory.
As a result of these advances, which again made machines cheaper and smaller, a new
mass market of users emerged during the 1960s.
Fourth Generation (microprocessors):
This revolution can be summed up in one word: Intel. The chip-maker developed the Intel
4004 chip in 1971, which placed all the computer components (CPU, memory, input/output
controls) onto a single chip. What filled a room in the 1940s now fit in the palm of the
hand. The Intel chip housed thousands of integrated circuits. The year 1981 saw the first
computer (from IBM) specifically designed for home use, and 1984 saw the Macintosh
introduced by Apple. Microprocessors even moved beyond the realm of computers and into
an increasing number of everyday products.
The increased power of these small computers meant they could be linked together to form
networks, which ultimately led to the development, birth and rapid evolution of the Internet.
Other major advances during this period were the graphical user interface (GUI), the mouse
and, more recently, the astounding advances in laptop capability and handheld devices.
Fifth Generation (artificial intelligence):
Computer devices with artificial intelligence are still in development, but some of these
technologies, such as voice recognition, are already beginning to emerge and be used.
The essence of the fifth generation is using these technologies to ultimately create
machines which can process and respond to natural language, and which have the
capability to learn and organise themselves.
TYPES OF COMPUTER:
Personal computer:
1. Notebook
2. Tower computer
3. Laptop
4. Subnotebook
5. Handheld
6. Palmtop
7. PDA
Mini Computer:
It is a midsize computer that can support around 200 users simultaneously.
Workstation:
A workstation is designed for engineering applications, SDLC and other kinds of
applications requiring moderate computing power and good graphics capabilities. It
generally has high-capacity storage media along with a large amount of RAM. Workstations
commonly run the UNIX and Linux operating systems and come in both diskless and
disk-drive variants.
Supercomputer and Mainframe:
A supercomputer is the fastest type of computer in the world and is very expensive. It is
designed for intensive mathematical calculations. For example, weather forecasting
requires a supercomputer. Other uses of supercomputers include scientific simulations,
(animated) graphics, fluid dynamics calculations, nuclear energy research, electronic
design, and analysis of geological data (e.g. in petrochemical prospecting). Perhaps the
best known supercomputer manufacturer is Cray Research.
Embedded SQL is a method of inserting inline SQL statements or queries into the
code of a programming language, which is known as a host language. Because the
host language cannot parse SQL, the inserted SQL is parsed by an embedded SQL
preprocessor.
Embedded SQL is a robust and convenient method of combining the computing power
of a programming language with SQL's specialized data management and
manipulation capabilities.
Every Oracle Database has a control file, which is a small binary file that records the
physical structure of the database. The control file includes:
Checkpoint information
The control file must be available for writing by the Oracle Database server
whenever the database is open. Without the control file, the database cannot be
mounted and recovery is difficult.
The control file of an Oracle Database is created at the same time as the database.
By default, at least one copy of the control file is created during database creation.
On some operating systems the default is to create multiple copies. You should
create two or more copies of the control file during database creation. You can also
create control files later, if you lose control files or want to change particular settings
in the control files.
6)Mirroring in Oracle
8)A group of servers in which, if one server fails, its users are switched instantly to the
other servers, is called a Cluster.
Microsoft has three technologies for clustering: Microsoft Cluster Service (MSCS, a HA
clustering service), Component Load Balancing (CLB) (part of Application Center
2000), and Network Load Balancing Services (NLB). In Windows Server
2008 and Windows Server 2008 R2 the MSCS service has been renamed to Windows
Server Failover Clustering and the Component Load Balancing (CLB) feature has been
deprecated.
9)Conversion of a message into a form that cannot be easily understood by
unauthorized people is called encryption.
Encryption is the conversion of electronic data into another form, called ciphertext,
which cannot be easily understood by anyone except authorized parties. Network
encryption (sometimes called network layer or network level encryption) is a network
security process that applies crypto services at the network transfer layer - above the
data link level, but below the application level. The network transfer layers are layers 3
and 4 of the Open Systems Interconnection (OSI) reference model, the layers
responsible for connectivity and routing between two end points. Using the existing
network services and application software, network encryption is invisible to the end
user and operates independently of any other encryption processes used. Data is
encrypted only while in transit, existing as plaintext on the originating and receiving
hosts.
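A hedged sketch of encryption at the application level using the standard Java crypto API (the message and the 128-bit AES key size are illustrative); it only shows that ciphertext is unreadable without the key, not network-layer encryption itself.
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class EncryptionDemo {
    public static void main(String[] args) throws Exception {
        // Generate a random AES key.
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128);
        SecretKey key = keyGen.generateKey();

        // Encrypt the plaintext into ciphertext.
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] ciphertext = cipher.doFinal("a secret message".getBytes("UTF-8"));
        System.out.println(Base64.getEncoder().encodeToString(ciphertext));

        // Only a holder of the key can decrypt it back to plaintext.
        cipher.init(Cipher.DECRYPT_MODE, key);
        System.out.println(new String(cipher.doFinal(ciphertext), "UTF-8"));
    }
}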
Since in this methodology a working model of the system is provided, the users get a
better understanding of the system being developed.
Practically, this methodology may increase the complexity of the system, as the scope of
the system may expand beyond the original plans.
12) Term used in networks for a unit of data which has a header and trailer: Packet
A data packet consists of three elements. The first element is the header, which contains
the information needed to get the packet from the source to the destination. The second
element is the data area, which contains the information of the user who caused the
creation of the packet. The third element is the trailer, which often contains error-checking
information (such as a checksum or CRC) used to detect errors that occur during
transmission. During communication, the sender appends the header and passes the
packet to the lower layer, while the receiver removes the header and passes it to the upper
layer. Headers are added at layers 6, 5, 4, 3 and 2, while the trailer is added at layer 2.
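A hedged sketch of the header/data/trailer idea (the field layout is made up for illustration): the trailer carries a CRC-32 checksum computed over the header and data so the receiver can detect transmission errors.
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class PacketDemo {
    public static void main(String[] args) {
        byte[] data = "user payload".getBytes();   // data area

        // Header: illustrative 8-byte header with source and destination fields.
        ByteBuffer header = ByteBuffer.allocate(8);
        header.putInt(1);   // pretend source address
        header.putInt(2);   // pretend destination address

        // Trailer: CRC-32 checksum over header + data for error detection.
        CRC32 crc = new CRC32();
        crc.update(header.array());
        crc.update(data);

        ByteBuffer packet = ByteBuffer.allocate(8 + data.length + 8);
        packet.put(header.array()).put(data).putLong(crc.getValue());
        System.out.println("Packet length = " + packet.capacity() + " bytes");
    }
}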
13) Project Management Tools. A Gantt chart, Logic Network, PERT chart, Product
Breakdown Structure and Work Breakdown Structure are standard tools used
in project planning.
15) If you are on an intranet and cannot access the Internet, what will you
check? Proxy settings
A proxy or proxy server is basically another computer which serves as a hub through
which internet requests are processed. By connecting through one of these servers, your
computer sends your requests to the proxy server, which then processes your request and
returns what you requested. In this way it serves as an intermediary between your machine
and the rest of the computers on the internet. Proxies are used for a number of reasons,
such as to filter web content, to get around restrictions such as parental blocks, to screen
downloads and uploads, and to provide anonymity when surfing the internet.
DNS: Domain Name System is an Internet service that translates domain names into IP
addresses.
The DNS has a distributed database that resides on multiple machines on the
Internet.
DNS has some protocols that allow the client and servers to communicate with each
other.
When the Internet was small, mapping was done by using hosts.txt file.
The host file was located at host's disk and updated periodically from a master host
file.
When any program or any user wanted to map domain name to an address, the host
consulted the host file and found the mapping.
Now that the Internet is no longer small, it is impossible to have only one host file relating
every address to a name and vice versa.
The solution used today is to divide the host file into smaller parts and store each part
on a different computer.
In this method, the host that needs mapping can call the closest computer holding the
needed information.
This method is used in Domain Name System (DNS).
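A small sketch of a DNS lookup from Java (the host name is a placeholder and the program needs network access): the resolver configured on the machine maps the domain name to an IP address.
import java.net.InetAddress;

public class DnsDemo {
    public static void main(String[] args) throws Exception {
        InetAddress address = InetAddress.getByName("example.com");
        System.out.println(address.getHostName() + " -> " + address.getHostAddress());
    }
}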
Name space
The names assigned to the machines must be carefully selected from a name space
with complete control over the binding between the names and IP addresses.
There are two types of name spaces: flat name spaces and hierarchical name spaces.
Keylogger:
A keylogger is a type of surveillance software (considered to be
either software or spyware) that has the capability to record every keystroke you make
to a log file, usually encrypted. A keylogger recorder can record instant messages, e-
mail, and any information you type at any time using your keyboard. The log file
created by the keylogger can then be sent to a specified receiver. Some
keylogger programs will also record any e-mail addresses you use and the Web
site URLs you visit.
Keyloggers, as a surveillance tool, are often used by employers to ensure employees
use work computers for business purposes only. Unfortunately, keyloggers can also be
embedded in spyware, allowing your information to be transmitted to an unknown third
party.
Cloud Computing: Cloud computing is a type of computing that relies on sharing
computing resources rather than having local servers or personal devices to
handle applications.
In cloud computing, the word cloud (also phrased as "the cloud") is used as a
metaphor for "the Internet," so the phrase cloud computing means "a type of Internet-
based computing," where different services such as servers, storage and
applications are delivered to an organization's computers and devices through the
Internet.
Here are a few of the things you can do with the cloud:
How it Works
Cloud computing applies traditional supercomputing, or high-performance
computing power, normally used by military and research facilities, to perform tens of
trillions of computations per second. In consumer-oriented applications, this power is used,
for example, to manage financial portfolios, to deliver personalized information, to provide
data storage, or to power large, immersive online computer games.
To do this, cloud computing uses networks of large groups of servers typically running
low-cost consumer PC technology with specialized connections to spread data-
processing chores across them. This shared IT infrastructure contains large pools of
systems that are linked together. Often, virtualization techniques are used to maximize
the power of cloud computing.
Infrastructure-as-a-service (IaaS)
The most basic category of cloud computing services. With IaaS, you rent IT
infrastructure (servers and virtual machines (VMs), storage, networks, operating
systems) from a cloud provider on a pay-as-you-go basis.
There are three different ways to deploy cloud computing resources: public
cloud, private cloud and hybrid cloud.
Public cloud
Public clouds are owned and operated by a third-party cloud service provider, which
delivers its computing resources, such as servers and storage, over the Internet. Microsoft
Azure is an example of a public cloud. With a public cloud, all hardware, software and
other supporting infrastructure is owned and managed by the cloud provider. You
access these services and manage your account using a web browser.
Private cloud
Hybrid cloud
Hybrid clouds combine public and private clouds, bound together by technology that
allows data and applications to be shared between them. By allowing data and
applications to move between private and public clouds, hybrid cloud gives businesses
greater flexibility and more deployment options.
MIS is the use of information technology, people, and business processes to record,
store and process data to produce information that decision makers can use to make
day to day decisions.
The following are some of the justifications for having an MIS system
Components of MIS
The type of information system that a user uses depends on their level in the
organization; each of the major levels of users in an organization (operational, tactical and
strategic) has a corresponding type of information system. At the operational level, the
Transaction Processing System is used.
This type of information system is used to record the day to day transactions of a
business. An example of a Transaction Processing System is a Point of Sale (POS)
system. A POS system is used to record the daily sales.
Management Information Systems are used to help tactical managers make semi-
structured decisions. The output from the transaction processing system is used as
input to the MIS system.
A manual information system does not use any computerized devices. The recording,
storing and retrieving of data is done manually by the people, who are responsible for
the information system.
1. Control key is used in combination with another key to perform a specific task
2. Scanner will translate images of text, drawings and photos into digital form
3. CPU is the brain of the computer
4. Something which has easily understood instructions is said to be user friendly
5. Information on a computer is stored as digital data
6. For creating a document, you use new command at file menu
7. The programs and data are kept in main memory while the processor is using them
8. Ctrl + A command is used to select the whole document
9. Sending an e-mail is the same as writing a letter
10. A Website address is a unique name that identifies a specific website on the web
11. Answer sheets in bank POs/Clerks examinations are checked by using Optical
Mark Reader
12. Electronic data interchange provides strategic and operational business opportunity
13. Digital signals used in ISDN have whole number values
30. The COPY command in MS-DOS is used to copy one or more files from one disk drive to
another, or from one directory to another directory
31. REN command is Internal command
32. Tim Berners-Lee propounded the concept of the World Wide Web
33. The set of wires over which the memory address is sent from the CPU to the main
memory is called the address bus
34. A MODEM is an electronic device required for the computer to connect to the INTERNET
35. A source program is a program which is to be translated into machine language
36. Virus in computer relates to program
37. Floppy is not a storage medium in the computer related hardware
38. DOS floppy disk does not have a boot record
39. The CPU in a computer comprises the store, arithmetic and logic unit, and control
unit
40. In computer parlance, a mouse is a pointing device
41. OMR is used to read choice filled up by the student in common entrance test
42. A network that spreads over cities is WAN
43. File Manager is not a part of a standard office suite
44. The topology of a computer network refers to the cabling layout between PCs
45. In UNIX command Ctrl + Z is used to suspend current process or command
46. Word is the word processor in MS Office
47. Network layer of an ISO-OSI reference model is for networking support
48. Telnet helps in remote login
49. MS Word allows creation of .DOC type of documents by default
50. In case of MS-access, the rows of a table correspond to records
51. Record maintenance in database is not a characteristic of E-mail
52. In a SONET system, an add/drop multiplexer can extract signals from and insert signals
into the stream and can also add/remove headers
53. The WWW standard allows programs on many different computer platforms to
show the information held on a server. Such programs are called Web Browsers
54. One of the oldest calculating devices was the abacus
55. Paint art is not a special program in MS Office
56. Outlook Express is an e-mail client, scheduler and address book
57. The first generation computers had vacuum tubes and magnetic drum
58. Office Assistant is an animated character that gives help in MS Office
59. AltaVista was created by the research facility of Digital Equipment Corporation of the USA
61. Search engines continuously send out spiders that start on the homepage of a
server and pursue all its links stepwise
62. Static keys make a network insecure
63. Joy Stick is an input device that cannot be used to work in MS Office
64. Artificial intelligence can be used in every sphere of life because of its ability to
think like human beings
65. To avoid wastage of memory, the instruction length should be of word size,
which is a multiple of character size
Set-2
1. A set of computer programs used for a certain function such as word processing is
the best definition of a software package
2. You can start Microsoft word by using start button
3. A blinking symbol on the screen that shows where the next character will appear
is a cursor
4. Highlight and delete is used to remove a paragraph from a report you had written
5. Date and time are available on the desktop at the taskbar
6. A directory within a directory is called sub directory
7. Testing is the process of finding errors in software code
8. In Excel, charts are created using chart wizard option
11. A tool bar contains buttons and menus that provide quick access to commonly
used commands
14. A programming language contains specific rules and words that express the
logical steps of an algorithm
15. One advantage of dial-up internet access is that it utilizes existing telephone lines
16. Protecting data by copying it from the original source is backup
17. Network components are connected to the same cable in the bus topology
18. Two or more computers connected to each other for sharing information form a network
19. A computer checks the database of user names and passwords for a match
before granting access
20. Computers that are portable and convenient for users who travel are known as
laptops
21. Spam is the term for unsolicited e-mail
22. The operating system controls the various computer parts and allows the
user to interact with the computer
23. Each cell in a Microsoft Office Excel document is referred to by its cell address,
which is the cell's row and column labels
24. Eight digit binary number is called a byte
25. Office LANs that are spread geographically apart on a large scale can be
connected using a corporate WAN
26. Installation is the process of copying software programs from secondary storage
media to the hard disk
27. The code for a web page is written using Hyper Text Markup Language
28. Small application programs that run on a Web page and may ensure a
form is completed properly or provide animation are known as applets
29. In a relational database, table is a data structure that organizes the information
about a single topic into rows and columns
30. The first computers were programmed using machine language
31. When the pointer is positioned on a hyperlink it is shaped like a hand
32. Booting process checks to ensure the components of the computer are operating
and connected properly
33. By checking the existing files saved on the disk, the user can determine what programs
are available on a computer
34. Special effects used to introduce slides in a presentation are called animation
35. Computers send and receive data in the form of digital signals
36. Most World Wide Web pages contain commands in the HTML language
37. Icons are graphical objects used to represent commonly used applications
38. UNIX is not owned and licensed by a company
39. In any window, the maximize button, the minimize button and the close buttons
appear on the title bar
40. Dial-up Service is the slowest internet connection service
41. Every component of your computer is either hardware or software
42. Checking that a pin code number is valid before it is entered into the system is
an example of data validation
43. A compiler translates higher level programs into a machine language program,
which is called object code
44. The ability to find an individual item in a file immediately is called direct access
50. A spreadsheet works like a calculator for keeping track of money and
making budgets
51. To take information from one source and bring it to your computer is referred to
as download
52. Each box in a spread sheet is called a cell
53. Network components are connected to the same cable in the bus topology
54. Two or more computers connected to each other for sharing information
form a network
55. A computer checks the database of user names and passwords for a match
before granting access.
56. Spam is the other name for unsolicited e-mail
57. Operating system controls the various computer parts and allows the user to
interact with the computer
58. Each cell in a Microsoft Office Excel document is referred to by its cell address,
which is the cell's row and column labels
59. Installation is the process of copying software programs from secondary storage
media to the hard disk
60. The code for a web page is written using Hypertext Markup Language
61. Small application programs that run on a web page and may ensure a form
is completed properly or provide animation are known as applets
62. A filename is a unique name that you give to a file of information
63. For seeing the output, you use monitor
64. CDs are round in shape
65. Control key is used in combination with another key to perform a specific task
66. Scanner will translate images of text, drawings and photos into digital form
67. CPU is the brain of the computer
68. Something which has easily understood instructions is said to be user friendly
69. Information on a computer is stored as digital data
70. For creating a document, you use new command at file menu
71. The programs and data are kept in main memory while the processor is using them
72. Ctrl + A command is used to select the whole document
73. Sending an e-mail is the same as writing a letter
74. A Website address is a unique name that identifies a specific website on the web
75. Answer sheets in bank POs/Clerks examinations are checked by using Optical
Mark Reader
76. Electronic data interchange provides strategic and operational business opportunity
77. Digital signals used in ISDN have whole number values
78. Assembler is language translation software
79. Manual data can be put into computer by scanner
80. In a bank, after computerization cheques are taken care of by MICR
81. The banks use MICR device to minimize conversion process
82. Image can be sent over telephone lines by using scanner
83. Microchip elements are unique to a smart card
84. MS-DOS is a single user operating system
85. Basic can be used for scientific and commercial purpose
Set-3
7. Storage that retains its data after the power is turned off is referred to as non-
volatile storage
8. Virtual memory is memory on the hard disk that the CPU uses as an extended
RAM
9. To move to the beginning of a line of text, press the home key
10. When sending and e-mail, the subject line describes the contents of the message
11. Microsoft Office is an application suite
12. Information travels between components on the motherboard through buses
13. One advantage of dial-up internet access is that it utilizes existing telephone lines
14. Network components are connected to the same cable in the bus topology
15. Booting checks to ensure the components of the computer are operating
and connected properly
16. Control key is used in combination with another key to perform a specific task
17. Scanner will translate images of text, drawings, and photos into digital form
18. Information on a computer is stored as digital data
19. The programs and data are kept in main memory while the processor is using them
20. The storage unit provides storage for information and instructions
21. The Help menu button exists at Start
22. Microsoft company developed MS Office 2000
23. Charles Babbage is called the father of modern computing
24. The data link layer of the OSI reference model provides the service of error detection
and control to the layer above it
25. Optical fiber is not a network
26. OMR is used to read choice filled up by the student in common entrance test
27. A network that spreads over cities is WAN
28. File Manager is not a part of a standard office suite
29. A topology of computer network means cabling between PCs
30. In UNIX command Ctrl + Z is used to suspend current process or command
31. Word is the word processor in MS Office
32. Network layer of an ISO-OSI reference model is for networking support
33. Telnet helps in remote login
34. MS Word allows creation of .DOC type of documents by default
35. In case of MS-access, the rows of a table correspond to records
36. Record maintenance in database is not a characteristic of E-mail
37. In a SONET system, an add/drop multiplexer can extract signals from and insert signals
into the stream and can also add/remove headers
38. The WWW standard allows programs on many different computer platforms to
show the information held on a server. Such programs are called Web Browsers
39. One of the oldest calculating devices was the abacus
40. Paint art is not a special program in MS Office
41. Outlook Express is an e-mail client, scheduler and address book
42. The first generation computers had vacuum tubes and magnetic drum
43. Office Assistant is an animated character that gives help in MS Office
44. AltaVista was created by the research facility of Digital Equipment Corporation of the USA
46. Search engines continuously send out spiders that start on the homepage of a
server and pursue all its links stepwise
47. Static keys make a network insecure
48. Joy Stick is an input device that cannot be used to work in MS Office
49. Artificial intelligence can be used in every sphere of life because of its ability to
think like human beings
50. To avoid wastage of memory, the instruction length should be of word size, which
is a multiple of character size
51. Electronic fund transfer is the exchange of money from one account to another
52. The Format menu in MS Word can be used to change page size and typeface
53. Assembly language programs are written using Mnemonics
54. DMA module can communicate with CPU through cycle stealing
55. A stored link to a web page, in order to have a quick and easy access to it later, is called
bookmark
56. B2B type of commerce is characterized by low volume and high value transactions in banking
57. Advanced is not a standard MS Office edition
58. Workstation is single user computer with many features and good processing power
59. History list is the name of list that stores the URLs of web pages and links visited in past few
days
60. FDDI access mechanism is similar to that of IEEE 802.5
61. MS Office 2000 included a full-fledged web designing software called FrontPage
62. 2000
63. Macintosh is IBMs microcomputer
64. X.21 is physical level standard for X.25
65. Enter key should be pressed to start a new paragraph in MS Word
66. Main frame is most reliable, robust and has a very high processing power.
67. The Formatting toolbar allows changing of fonts and their sizes
68. The ZZ command is used to quit the vi editor after saving
69. The program supplied by VSNL when you ask for internet connection for the e-mail access is
pine
70. The convenient place to store contact information for quick, retrieval is address book
71. Digital cash is not a component of an e-wallet
72. For electronic banking, we should ensure the existence of procedures with regard to the
identification of customers who become members electronically
73. John von Neumann developed the stored-program concept
74. Hardware and software are mandatory parts of complete PC system
75. Firewall is used in PC for security
76. Two rollers are actually responsible for the movement of the cursor in a mouse