Informatics Practices Xi
2020-21
NCERT Campus
Sri Aurobindo Marg
New Delhi 110 016 Phone : 011-26562708
108, 100 Feet Road
Hosdakere Halli Extension
Banashankari III Stage
Bengaluru 560 085 Phone : 080-26725740
Navjivan Trust Building
P.O.Navjivan
Ahmedabad 380 014 Phone : 079-27541446
140.00 CWC Campus
Opp. Dhankal Bus Stop
Panihati
Kolkata 700 114 Phone : 033-25530454
CWC Complex
Maligaon
Guwahati 781 021 Phone : 0361-2674869
Publication Team
Head, Publication Division : M. Siraj Anwar
Chief Business Manager : Bibash Kumar Das
Chief Production Officer : Arun Chitkara
Editor : Bijnan Sutar
Production Assistant : Mukesh Gaur
Cover Design and Layout : Meetu Sharma (Contractual)
Printed on 80 GSM paper with NCERT watermark
Published at the Publication Division by the Secretary, National Council of Educational Research and Training, Sri Aurobindo Marg, New Delhi 110016 and printed at Swan Press, 308 & 309, Sector-7 Manesar, Gurugram - 122050, Haryana.
Hrushikesh Senapaty
Director
National Council of Educational Research and Training
New Delhi
July 2019
Members
Anuradha Khattar, Assistant Professor, Miranda House, University of
Delhi, Delhi
Chetna Khanna, Freelance Educationist, Delhi
Gurpreet Kaur, PGT (Computer Science), GD Goenka Public School, Delhi
Harita Ahuja, Assistant Professor, Acharya Narendra Dev College,
University of Delhi, Delhi
Mudasir Wani, Assistant Professor, Govt. Degree College for Women,
Srinagar, Jammu and Kashmir
Om Vikas, Professor (Retd.), Formerly Director, ABV-IIITM, Gwalior,
Madhya Pradesh
Priti Rai Jain, Assistant Professor, Miranda House, University of Delhi,
Delhi
Rinku Kumari, PGT (Computer Science), Kendriya Vidyalaya, Sainik Vihar,
Delhi
Sharanjit Kaur, Associate Professor, Acharya Narendra Dev College,
University of Delhi, Delhi
Tapasi Ray, Formerly Global IT Director, Huntsman Corporation, Singapore
Member-Coordinator
Rejaul Karim Barbhuiya, Assistant Professor, DESM, NCERT, Delhi
Chapter 1
Computer System
“A computer would deserve to be called intelligent if it could deceive a human into believing that it was human.”
— Alan Turing
In this chapter
»» Introduction to Computer System
»» Evolution of Computer
»» Computer Memory
»» Software
1642 Pascaline: Blaise Pascal invented a mechanical calculator known as Pascal calculator or Pascaline to do addition and subtraction of two numbers directly and multiplication and division through repeated addition or subtraction.
1890 Tabulating Machine: Herman Hollerith designed a tabulating machine for summarising the data stored on the punched card. It is considered to be the first step towards programming.
1945 EDVAC/ENIAC: John Von Neumann introduced the concept of stored program computer which was capable of storing data as well as program in the memory. The EDVAC and then the ENIAC computers were developed based on this concept.
1970 Integrated Circuit: An Integrated Circuit (IC) is a silicon chip which contains entire electronic circuit on a very small area. The size of computer has drastically reduced because of ICs.
[Plot: number of transistors per integrated circuit (log scale, 1 to 10,000,000,000) against year (1940 to 2020), from the invention of the transistor through Intel microprocessors from the 4004 to the Core i7; the count doubles every 2 years.]
Figure 1.6: Exponential Increase in Number of Transistors used in ICs Over Time
based operating systems by Microsoft and others in
place of computers with only command line interface,
like UNIX or DOS. Around 1990s, the growth of world
wide web (WWW) further accelerated mass usage of
computers and thereafter computers have become an
indispensable part of everyday life.
Further, with the introduction of laptops, personal computing was made portable to a great extent. This was followed by smartphones, tablets and other personal digital assistants. These devices have leveraged the technological advancements in processor miniaturisation, faster memory, high speed data and connectivity mechanisms.
The next wave of computing devices includes wearable gadgets such as smart watch, lenses, headbands, headphones, etc. Further, smart appliances are becoming a part of the Internet of Things (IoT), by leveraging the power of artificial intelligence.
In 1965, Intel co-founder Gordon Moore introduced Moore’s Law which predicted that the number of transistors on a chip would double every two years while the costs would be halved.
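Moore's Law, as stated in this section, is a doubling rule, so it can be tried out with a few lines of Python. The sketch below is only an illustration: the function name projected_transistors and the starting point (the Intel 4004 of 1971 with about 2,300 transistors) are assumptions brought in for the example and are not taken from this chapter.

```python
# Sketch of the doubling rule behind Moore's Law.
# Assumption: the start values (about 2,300 transistors in 1971) are
# supplied only for illustration.
def projected_transistors(start_count, start_year, year, doubling_period=2):
    """Project a count that doubles once every doubling_period years."""
    doublings = (year - start_year) / doubling_period
    return start_count * 2 ** doublings

for year in (1971, 1981, 1991, 2001, 2011):
    print(year, round(projected_transistors(2300, 1971, year)))
```

Every ten years adds five doublings, so each printed value is 32 times the previous one. Real chips only roughly follow this curve.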
1.4 Software
Till now, we have studied about the physical components or the hardware of the computer system. But the hardware is of no use on its own. Hardware needs to be operated by a set of instructions. These sets of instructions are referred to as software. It is that component of a computer system, which we cannot touch or view physically. It comprises the instructions and data to be processed using the computer hardware. The computer software and hardware complete any task together.
The software comprises a set of instructions which on execution deliver the desired outcome. In other words, each software is written for some computational purpose. Some examples of software include operating systems like Ubuntu or Windows 7/10, word processing tools like LibreOffice Writer or Microsoft Word, video players like VLC Player, and photo editors like Paint and LibreOffice Draw. A document or image stored on the hard disk or pen drive is referred to as a softcopy. Once printed, the document or an image is called a hardcopy.
Hardware refers to the physical components of the computer system which can be seen and touched. For example, RAM, keyboard, printer, monitor, CPU etc. On the other hand, software is a set of instructions and data that makes hardware functional to complete the desired task.
Summary
• A computing device, also referred to as a computer,
processes the input data as per given instructions
to generate desired output.
• Computer processes data to generate information
whose further analysis and interpretation yields
knowledge.
• Computer system has four physical components
viz. i) CPU ii) Primary memory iii) Input device and
iv) Output device. They are referred to as hardware
of computer.
• Computer system has two types of primary
memories viz. i) RAM the volatile memory and ii)
ROM the non-volatile memory.
• Software is a set of instructions written to achieve
the desired task and are mainly categorised as
system software, programming tools and application
software.
• Hardware of a computer cannot function on its own.
It needs software to be operational or functional.
• Operating system is an interface between the user
and the computer and supervises the working of
computer system i.e. it monitors and controls the
hardware and software of the computer system.
Exercise
1. Name the software required to make a computer
functional. Write down its two primary functions.
2. What is the need of RAM? How does it differ from ROM?
3. What is the need for secondary memory?
4. Draw the block diagram of a computer system. Briefly
write about the functionality of each component.
5. Differentiate between proprietary software and freeware
software. Name two software of each type.
6. Mention any browsers used for browsing the internet.
Chapter 2
Emerging Trends
“Computer science is no more about computers than astronomy is about telescopes”
— Edsger Dijkstra
In this chapter
» Introduction to Emerging Trends
» Artificial Intelligence (AI)
» Big Data
» Internet of Things (IoT)
» Cloud Computing
» Grid Computing
» Blockchains
Activity 2.1
Find out how NLP is helping differently-abled persons.
The predictive typing feature of search engine that helps us by suggesting the next word in the sentence while typing keywords and the spell checking features are examples of Natural Language Processing (NLP). It deals with the interaction between human and
Figure 2.5: NASA’s Mars Exploration Rover (MER)
Figure 2.6: Sophia: a Humanoid
Figure 2.7: An unmanned aircraft
software-controlled flight plans in their embedded systems, working in conjunction with onboard sensors and GPS (Figure 2.7). Drones are being used in many fields, such as journalism, filming and aerial photography, shipping or delivery at short distances, disaster management, search and rescue operations, healthcare, geographic mapping and structural safety inspections, agriculture, wildlife monitoring or anti-poaching, besides law-enforcement and border patrolling.
Think and Reflect
Can a drone be helpful in the event of a natural calamity?
2.7 Blockchains
Traditionally, we perform digital transactions by storing
data in a centralised database and the transactions
performed are updated one by one on the database. That
is how the ticket booking websites or banks operate.
However, since all the data is stored on a central
location, there are chances of data being hacked or lost.
The blockchain technology works on the concept of
decentralised and shared database where each computer
[Figure, first two steps: Someone requests a transaction → The request is broadcast to all nodes in the network]
Summary
• Artificial Intelligence endeavours to simulate the
natural intelligence of human beings into machines
thus making them intelligent.
• Machine learning comprises algorithms that use
data to learn on their own and make predictions.
• Natural language processing (NLP) facilitates
communicating with intelligent systems using a
natural language.
• Virtual reality allows a user to look at, explore, and
interact with the virtual surroundings, just like one
can do in the real world.
• The superimposition of computer-generated
perceptual information over the existing physical
surroundings is called augmented reality.
• Robotics can be defined as the science primarily
associated with the design, fabrication, theory, and
application of robots.
• Big data holds rich information and knowledge which
can be of high business value. Five characteristics
of big data are: Volume, Velocity, Variety, Veracity,
and Value.
• Data analytics is the process of examining data sets
in order to draw conclusions about the information
they contain.
• The Internet of Things (IoT) is a network of devices
that have an embedded hardware and software to
communicate (connect and exchange data) with
other devices on the same network.
• A sensor is a device that takes input from the
physical environment and uses built-in computing
resources to perform predefined functions upon
detection of specific input and then processes data
before passing it on.
Exercise
1. List some of the cloud-based services that you are using
at present.
2. What do you understand by the Internet of Things? List
some of its potential applications.
3. Write a short note on the following:
a) Cloud computing
b) Big data and its characteristics
4. Explain the following along with their applications.
a) Artificial Intelligence
b) Machine Learning
5. Differentiate between cloud computing and grid
computing with suitable examples.
6. Justify the following statement-
‘Storage of data is cost effective and time saving in cloud
computing.’
7. What is on-demand service? How is it provided in cloud
computing?
8. Write examples of the following:
a) Government provided cloud computing platform
b) Large scale private cloud service providers and the
services they provide
9. A company interested in cloud computing is looking for a
provider who offers a set of basic services such as virtual
server provisioning and on-demand storage that can be
combined into a platform for deploying and running
customised applications. What type of cloud computing
model fits these requirements?
a) Platform as a Service
b) Software as a Service
c) Infrastructure as a Service
of Python 3
“Don't you hate code that's not properly indented? Making it [indenting] part of the syntax guarantees that all code is properly indented.”
— G. van Rossum
In this chapter
»» Introduction to Python
»» Python Keywords
»» Identifiers
»» Variables
»» Data Types
»» Operators
»» Expressions
»» Input and Output
»» Debugging
»» Functions
»» if..else Statements
»» for Loop
»» Nested Loops
3.1 Introduction to Python
An ordered set of instructions or commands to be
executed by a computer is called a program. The
language used to specify this set of instructions
to the computer is called a programming language,
for example Python, C, C++, Java, etc.
This chapter gives a brief overview of Python
programming language. Python is a very popular
and easy to learn programming language, created
by Guido van Rossum in 1991. It is used in a
variety of fields, including software development,
web development, scientific computing, big data
3.3 Identifiers
In programming languages, identifiers are names used
to identify a variable, function, or other entities in a
program. The rules for naming an identifier in Python
are as follows:
• The name should begin with an uppercase or a
lowercase alphabet or an underscore sign (_). This
may be followed by any combination of characters
a-z, A-Z, 0-9 or underscore (_). Thus, an identifier
cannot start with a digit.
• It can be of any length. (However, it is preferred to
keep it short and meaningful).
• It should not be a keyword or reserved word given in
Table 3.1.
• We cannot use special symbols like !, @, #, $, %, etc.
in identifiers.
For example, to find the average of marks obtained
by a student in three subjects namely Maths, English,
Informatics Practices (IP), we can choose the identifiers
as marksMaths, marksEnglish, marksIP and avg
rather than a, b, c, or A, B, C, as such alphabets do not
give any clue about the data that variable refers to.
avg = (marksMaths + marksEnglish + marksIP)/3
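The naming rules above can be checked from within Python itself. The following is a minimal sketch: the helper name is_valid_identifier and the sample marks are made up for illustration. Note that str.isidentifier() is slightly more permissive than the rules listed here, since it also accepts letters outside a-z and A-Z.

```python
import keyword

def is_valid_identifier(name):
    """Return True if name can be used as an identifier."""
    # isidentifier() checks the character rules (no leading digit, no
    # special symbols); iskeyword() rejects reserved words such as True.
    return name.isidentifier() and not keyword.iskeyword(name)

for name in ["marksMaths", "_avg", "1st_Room", "total-Marks", "True"]:
    print(name, is_valid_identifier(name))

# Meaningful identifiers make the calculation easy to read
# (the marks below are sample values)
marksMaths, marksEnglish, marksIP = 80, 70, 90
avg = (marksMaths + marksEnglish + marksIP) / 3
print(avg)
```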
3.4 Variables
A variable is an identifier whose value can change. For
example, the variable age can have a different value for
each person. A variable name should be unique in a
program. Value of a variable can be string (for example,
Example 3.2
#To create a list
>>> list1 = [5, 3.4, "New Delhi", "20C", 45]
#print the elements of the list list1
>>> list1
[5, 3.4, 'New Delhi', '20C', 45]
(C) Tuple
Tuple is a sequence of items separated by commas and
items are enclosed in parentheses ( ). This is unlike list,
where values are enclosed in square brackets [ ]. Once created,
we cannot change items in the tuple. Similar to List,
items may be of different data types.
Example 3.3
#create a tuple tuple1
>>> tuple1 = (10, 20, "Apple", 3.4, 'a')
#print the elements of the tuple tuple1
>>> print(tuple1)
(10, 20, 'Apple', 3.4, 'a')
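The statement that tuple items cannot be changed can be seen by trying to assign to one. This is a small sketch using the same sample values; the replacement value 99 is arbitrary.

```python
tuple1 = (10, 20, "Apple", 3.4, 'a')

# Assigning to an item of a tuple raises a TypeError
try:
    tuple1[0] = 99
except TypeError as error:
    print("Cannot modify a tuple:", error)

# A list holding the same items can be changed
list1 = list(tuple1)
list1[0] = 99
print(list1)
print(tuple1)   # the tuple itself is unchanged
```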
3.5.3 Mapping
A mapping stores data as key-value pairs that are looked up
by key rather than by position. Currently, there is only one
standard mapping data type in Python called Dictionary.
(A) Dictionary
Dictionary in Python holds data items in key-value pairs
and items are enclosed in curly brackets { }. Dictionaries
permit faster access to data. Every key is separated from
its value using a colon (:) sign. The key value pairs of
a dictionary can be accessed using the key. Keys are
usually of string type and their values can be of any data
type. In order to access any value in the dictionary, we
have to specify its key in square brackets [ ].
Example 3.4
#create a dictionary
>>> dict1 = {'Fruit':'Apple',
'Climate':'Cold', 'Price(kg)':120}
>>> print(dict1)
{'Fruit': 'Apple', 'Climate': 'Cold',
'Price(kg)': 120}
#getting value by specifying a key
>>> print(dict1['Price(kg)'])
120
3.6 Operators
An operator is used to perform specific mathematical or logical operation on values. The values that the operator works on are called operands. For example, in the expression 10 + num, the value 10, and the variable num are operands and the + (plus) sign is an operator. Python supports several kinds of operators; their categorisation is briefly explained in this section.
Python compares two strings lexicographically (according to the theory and practice of composing and writing dictionary), using ASCII value of the characters. If the first character of both strings are same, the second character is compared, and so on.
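Python compares strings one character at a time using the numeric value of each character, which ord() reveals. A short sketch with made-up sample strings:

```python
# ord() gives the numeric (ASCII) value of a character
print(ord('A'), ord('a'))          # 65 97

# 'A' (65) is smaller than 'a' (97), so "Apple" comes first
upper_first = "Apple" < "apple"
print(upper_first)                 # True

# The first three characters match, so the fourth decides:
# 'l' (108) is smaller than 'p' (112)
fourth_decides = "Hello" < "Help"
print(fourth_decides)              # True
```

Because uppercase letters have smaller values than lowercase ones, comparisons of mixed-case strings may differ from the order found in a printed dictionary.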
3.6.1 Arithmetic Operators
Python supports arithmetic operators (Table 3.3) to
perform the four basic arithmetic operations as well as
modular division, floor division and exponentiation.
'+' operator can also be used to concatenate two
strings on either side of the operator.
>>> str1 = "Hello"
>>> str2 = "India"
>>> str1 + str2
'HelloIndia'
'*' operator repeats the item on left side of the
operator if first operand is a string and second operand
is an integer value.
>>> str1 = 'India'
>>> str1 * 2
'IndiaIndia'
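Modular division, floor division and exponentiation are less familiar than the four basic operations, so a short sketch may help. The operand values 13 and 5 are chosen only for illustration.

```python
num1, num2 = 13, 5

remainder = num1 % num2    # modular division gives the remainder
quotient = num1 // num2    # floor division drops the fractional part
power = num1 ** 2          # exponentiation
print(remainder, quotient, power)   # 3 2 169

print(num1 / num2)         # ordinary division gives a float: 2.6

# Floor division rounds towards minus infinity for negative operands
print(-13 // 5)            # -3
```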
3.7 Expressions
An expression is defined as a combination of constants,
variables and operators. An expression always evaluates
to a value. A value or a standalone variable is also
considered as an expression but a standalone operator
is not an expression. Some examples of valid expressions
are given below.
(i) num - 20.4
(ii) 3.0 + 3.14
(iii) 23/3 - 5 * 7 * (14 - 2)
(iv) "Global"+"Citizen"
3.9 Debugging
Due to errors, a program may not execute or may
generate wrong output. Errors in a program are of three kinds:
i) Syntax errors
ii) Logical errors
iii) Runtime errors
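One small example of each kind of error is sketched below. The variable names and values are illustrative only; compile() is used so that the faulty statement can be shown inside a program that still runs.

```python
# i) Syntax error: the statement breaks the grammar of Python,
#    so Python refuses to run it at all.
try:
    compile("print('Hello'", "<example>", "exec")   # closing ) is missing
except SyntaxError as error:
    print("Syntax error:", error.msg)

# ii) Logical error: the program runs but gives a wrong answer.
marks1, marks2 = 10, 12
wrong_avg = marks1 + marks2 / 2      # evaluated as 10 + (12 / 2)
right_avg = (marks1 + marks2) / 2
print(wrong_avg, right_avg)          # 16.0 11.0

# iii) Runtime error: the statement is valid but fails while running.
try:
    print(marks1 / 0)
except ZeroDivisionError as error:
    print("Runtime error:", error)
```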
3.10 Functions
A function refers to a set of statements or instructions
grouped under a name that perform specified tasks.
For repeated or routine tasks, we define a function. A
function is defined once and can be reused at multiple
Example 3.13
>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
#start value is given as 2
>>> list(range(2, 10))
[2, 3, 4, 5, 6, 7, 8, 9]
#step value is 5 and start value is 0
>>> list(range(0, 30, 5))
[0, 5, 10, 15, 20, 25]
#step value is -1. Hence, decreasing
#sequence is generated
>>> list(range(0, -9, -1))
[0, -1, -2, -3, -4, -5, -6, -7, -8]
The function range() is often used in for loops for
generating a sequence of numbers.
Program 3-5 Program to print the multiples of 10 for
numbers in a given range.
#Program 3-5
#Print multiples of 10 for numbers in a range
for num in range(5):
    if num > 0:
        print(num * 10)
Output:
10
20
30
40
#Program 3-6
#Demonstrate working of nested for loops
for var1 in range(3):
    print( "Iteration " + str(var1 + 1) + " of outer loop")
    for var2 in range(2):    #nested loop
        print(var2 + 1)
    print("Out of inner loop")
print("Out of outer loop")
Output:
Iteration 1 of outer loop
1
2
Out of inner loop
Iteration 2 of outer loop
1
2
Out of inner loop
Iteration 3 of outer loop
1
2
Out of inner loop
Out of outer loop
Summary
• Python is an open-source, high level, interpreter-
based language that can be used for a multitude of
scientific and non-scientific computing purposes.
• Comments are non-executable statements in a
program.
• An identifier is a user defined name given to a
variable or a constant in a program.
• Process of identifying and removing errors from a
computer program is called debugging.
• Trying to use a variable that has not been assigned
a value gives an error.
• There are several data types in Python — integer,
boolean, float, complex, string, list, tuple, sets,
None and dictionary.
• Operators are constructs that manipulate the value
of operands. Operators may be unary or binary.
• An expression is a combination of values, variables,
and operators.
• Python has input() function for taking user input.
• Python has print() function to output data to a
standard output device.
• The if statement is used for decision making.
• Looping allows sections of code to be executed
repeatedly under some condition.
• for statement can be used to iterate over a range
of values or a sequence.
• The statements within the body of for loop are
executed till the range of values is exhausted.
Exercise
1. Which of the following identifier names are invalid and
why?
a) Serial_no.
b) 1st_Room
c) Hundred$
d) Total Marks
e) Total_Marks
f) total-Marks
g) _Percentage
h) True
»» Introduction to List
»» List Operations
»» Traversing a List
»» List Methods and Built-in Functions
“Computer Science is a science of abstraction – creating the right model for a problem and devising the appropriate
4.2.2 Repetition
Python allows us to replicate the contents of a list using
repetition operator depicted by symbol *.
>>> list1 = ['Hello']
#elements of list1 repeated 4 times
>>> list1 * 4
['Hello', 'Hello', 'Hello', 'Hello']
4.2.3 Membership
The membership operator in checks if the element
is present in the list and returns True, else returns
False.
>>> list1 = ['Red','Green','Blue']
>>> 'Green' in list1
True
>>> 'Cyan' in list1
False
The operator not in returns True if the
element is not present in the list, else it returns False.
>>> list1 = ['Red','Green','Blue']
>>> 'Cyan' not in list1
True
>>> 'Green' not in list1
False
4.2.4 Slicing
Slicing operations allow us to create new list by taking
out elements from an existing list.
>>> list1 =['Red','Green','Blue','Cyan',
'Magenta','Yellow','Black']
#sublist from indexes 2 to 5 of list1
>>> list1[2:6]
['Blue', 'Cyan', 'Magenta', 'Yellow']
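Slices can also leave out the start or end index, use negative indexes, or take a step. A brief sketch on the same list:

```python
list1 = ['Red', 'Green', 'Blue', 'Cyan', 'Magenta', 'Yellow', 'Black']

first_three = list1[:3]        # start index omitted: begins at index 0
from_four = list1[4:]          # end index omitted: runs to the end
alternate = list1[0:6:2]       # step of 2: indexes 0, 2 and 4
last_three = list1[-3:]        # negative index counts from the end
reversed_list = list1[::-1]    # step of -1 gives the list in reverse

print(first_three)
print(from_four)
print(alternate)
print(last_three)
print(reversed_list)
print(len(list1))              # slicing never changes the original list
```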
len(): Returns the length of the list passed as the argument
>>> list1 = [10,20,30,40,50]
>>> len(list1)
5
extend(): Appends each element of the list passed as argument at the end of the given list
>>> list1 = [10,20,30]
>>> list2 = [40,50]
>>> list1.extend(list2)
>>> list1
[10, 20, 30, 40, 50]
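It is easy to confuse extend() with append() when the argument is itself a list. The sketch below uses two hypothetical lists, listA and listB, to show the difference.

```python
listA = [10, 20, 30]
listA.append([40, 50])     # the whole list is added as ONE element
print(listA)               # [10, 20, 30, [40, 50]]

listB = [10, 20, 30]
listB.extend([40, 50])     # each element is added separately
print(listB)               # [10, 20, 30, 40, 50]

print(len(listA), len(listB))   # 4 5
```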
Program 4-1 Write a program to allow user to perform any of the list operations
given in a menu. The menu is:
1. Append an element
2. Insert an element
3. Append a list to the given list
4. Modify an existing element
5. Delete an existing element from its position
6. Delete an existing element with a given value
7. Sort the list in the ascending order
8. Sort the list in descending order
9. Display the list.
#Program 4-1
#Menu driven program to do various list operations
myList = [22,4,16,38,13]    #myList having 5 elements
choice = 0
for attempt in range(3):
    print("Attempt number:", attempt + 1)
    print("The list 'myList' has the following elements", myList)
    print("\nL I S T   O P E R A T I O N S")
    print(" 1. Append an element")
    print(" 2. Insert an element at the desired position")
    print(" 3. Append a list to the given list")
    print(" 4. Modify an existing element")
    print(" 5. Delete an existing element by its position")
    print(" 6. Delete an existing element by its value")
    print(" 7. Sort the list in ascending order")
    print(" 8. Sort the list in descending order")
    print(" 9. Display the list")
    choice = int(input("ENTER YOUR CHOICE (1-9) "))
    #append element
    if choice == 1:
        #eval() converts the typed text into a Python value
        element = eval(input("Enter the element to be appended: "))
        myList.append(element)
        print("The element has been appended\n")
The list 'myList' has the following elements [38, 22, 13, 4]
Attempt number : 3
L I S T O P E R A T I O N S
1. Append an element
2. Insert an element at the desired position
3. Append a list to the given list
4. Modify an existing element
5. Delete an existing element by its position
6. Delete an existing element by its value
7. Sort the list in ascending order
8. Sort the list in descending order
9. Display the list
ENTER YOUR CHOICE (1-9) 10
choice is not valid
position = -1
for i in range(0, len(list1)):
    if list1[i] == num:     #number is present
        position = i        #save the position of number
if position == -1:
    print("Number",num,"is not present in the list")
else:
    print("Number",num,"is present at",position + 1, "position")
Output:
How many numbers do you want to enter in the list
5
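The search loop shown above can also be wrapped in a function so that it is easy to try on different lists. This is a sketch only: the function name linear_search and the sample list are assumptions made for the example. Like the loop above, it keeps scanning to the end, so it reports the last matching position.

```python
def linear_search(values, num):
    """Return the position (counting from 1) of num in values, or -1."""
    position = -1
    for i in range(len(values)):
        if values[i] == num:      # number is present
            position = i + 1      # save the position, counting from 1
    return position

sample = [8, 23, 10, 23, 5]
print(linear_search(sample, 10))   # 3
print(linear_search(sample, 23))   # 4 (last occurrence)
print(linear_search(sample, 7))    # -1 (not present)
```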
del: Deletes the item with the given key. To delete the dictionary from the memory we write: del Dict_name
>>> dict1 = {'Mohan':95,'Ram':89,
'Suhel':92, 'Sangeeta':85}
>>> del dict1['Ram']
>>> dict1
{'Mohan': 95, 'Suhel': 92, 'Sangeeta': 85}
>>> del dict1
>>> dict1
NameError: name 'dict1' is not defined
(i) Delete the item from the dictionary ODD corresponding to the key 9.
>>> del ODD[9]
>>> ODD
{1: 'One', 3: 'Three', 5: 'Five', 7: 'Seven'}
'Joseph' 24000
'Rahul' 30000
'Zoya' 25000
result = ''
for ch in num:
    key = int(ch)    #converts character to integer
    value = numberNames[key]
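The loop above looks up each digit of a number in a dictionary. A complete version might look like the sketch below, which assumes that numberNames maps every digit to its name and that num is a string of digits. The function name digits_to_names and the way the names are joined are illustrative choices, not necessarily those of the original program.

```python
# Assumed lookup table: each digit mapped to its name
numberNames = {0: 'Zero', 1: 'One', 2: 'Two', 3: 'Three', 4: 'Four',
               5: 'Five', 6: 'Six', 7: 'Seven', 8: 'Eight', 9: 'Nine'}

def digits_to_names(num):
    """Return the names of the digits in num, a string such as '6512'."""
    result = ''
    for ch in num:
        key = int(ch)                    # converts character to integer
        value = numberNames[key]         # look up the name of the digit
        result = result + ' ' + value    # add it to the answer so far
    return result.strip()                # drop the leading space

print(digits_to_names('6512'))   # Six Five One Two
```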
Summary
• Lists are mutable sequences in Python, i.e. we can
change the elements of the list.
• Elements of a list are put in square brackets
separated by comma.
• List indexing is same as that of strings and starts at 0.
Two way indexing allows traversing the list in the
forward as well as in the backward direction.
• Operator + concatenates one list to the end of other
list.
• Operator * repeats the content of a list by
specified number of times.
• Membership operator in tells if an element is
present in the list or not and not in does the
opposite.
• Slicing is used to extract a part of the list.
• There are many list methods and built-in functions. A few
are: len(), list(), append(), extend(), insert(), count(),
index(), remove(), pop(), reverse(), sort(), sorted(),
min(), max(), sum().
• Dictionary is a mapping (non scalar) data type. It
is a collection of key-value pairs accessed by key; the
key-value pairs are put inside curly braces.
• Each key is separated from its value by a colon.
• Keys are unique and act as the index.
• Keys are of immutable type but values can be
mutable.
Exercise
1. What will be the output of the following statements?
a) list1 = [12,32,65,26,80,10]
list1.sort()
print(list1)
b) list1 = [12,32,65,26,80,10]
sorted(list1)
print(list1)
c) list1 = [1,2,3,4,5,6,7,8,9,10]
list1[::-2]
list1[:3] + list1[3:]
d) list1 = [1,2,3,4,5]
list1[len(list1)-1]
2. Consider the following list myList. What will be
the elements of myList after each of the following
operations?
myList = [10,20,30,40]
a) myList.append([50,60])
b) myList.extend([80,90])
3. What will be the output of the following code segment?
myList = [1,2,3,4,5,6,7,8,9,10]
for i in range(0,len(myList)):
    if i%2 == 0:
        print(myList[i])
4. What will be the output of the following code segment?
a) myList = [1,2,3,4,5,6,7,8,9,10]
del myList[3:]
print(myList)
b) myList = [1,2,3,4,5,6,7,8,9,10]
del myList[:5]
print(myList)
c) myList = [1,2,3,4,5,6,7,8,9,10]
del myList[::2]
print(myList)
5. Differentiate between append() and extend() methods
of list.
Programming Problems
1. Write a program to find the number of times an element
occurs in the list.
Case Study
1. A bank is a financial institution which is involved in
borrowing and lending of money. With advancement
in technology, online banking, also known as internet
banking allows customers of a bank to conduct a range
of financial transactions through the bank’s website
anytime, anywhere. As part of initial investigation you
are suggested to:
• Collect a Bank’s application form. After careful
analysis of the form, identify the information
required for opening a savings account. Also
enquire about the rate of interest offered for a
savings account.
• The basic two operations performed on an account
are Deposit and Withdrawal. Write a menu driven
program that accepts either of the two choices
of Deposit and Withdrawal, then accepts an
amount, performs the transaction and accordingly
displays the balance. Remember every bank has
a requirement of minimum balance which needs
to be taken care of during withdrawal operations.
Chapter 5
Understanding Data
“Data is not information, Information
In this chapter
»» Introduction to Data
»» Data Collection
»» Data Storage
A website handling online filling of student details for a competitive examination and generating admit card
Inputs: ATM PIN number, account type, account number, card number, ATM location from where money was withdrawn, date and time, and amount to be withdrawn.
Processing: Checking for valid PIN number, existing bank balance, if satisfied, then deduction of amount from that account and counting of rupees and initiate printing of receipt
Outputs: Currency notes, printed slip with transaction details

Inputs: Journey start and end stations, date of journey, number of tickets required, class of travel (Sleeper/AC/other), berth preference (if any), passenger name(s) and age(s), mobile and email id, payment related details, etc.
Processing: Verify login details and check availability of berth in that class. If payment done, issue tickets and deduct that number from the total available tickets on that coach. Allocate PNR number and berths or generate a waiting number for that ticket.
Outputs: Generate ticket with berth and coach number, or issue ticket with a waiting list number
computed as $\frac{\sum_{i} x_i}{n}$.
Example 5.1
Assume that heights (in cm) of students in a class are as
follows [90,102,110,115,85,90,100,110,110]. Mean or
average height of the class is
$\frac{90 + 102 + 110 + 115 + 85 + 90 + 100 + 110 + 110}{9} = \frac{912}{9} = 101.33$ cm
(B) Median
Median is also computed for a single attribute/variable
at a time. When all the values are sorted in ascending or
descending order, the middle value is called the Median.
When there are odd number of values, then median is
the value at the middle position. If the list has even
number of values, then median is the average of the two
middle values. Median represents the central value at
which the given data is equally divided into two parts.
Example 5.2
Consider the previous data of height of students used in calculation of mean value. In order to compute the median, the first step is to sort data in ascending or descending order. We have sorted the height data in ascending order as [85,90,90,100,102,110,110,110,115]. As there are total 9 values (odd number), the median is the value at position 5, that is 102 cm, whether counted from left to right or from right to left. Median represents the actual central value at which the given data is equally divided into two parts.
Think and Reflect
Out of Mean and Median, which one is more sensitive to outliers in data?
(C) Mode
Value that appears most number of times in the given
data of an attribute/variable is called Mode. It is
computed on the basis of frequency of occurrence of
distinct values in the given data. A data set has no mode
if each value occurs only once. There may be multiple
modes in the data if more than one values have same
highest frequency. Mode can be found for numeric as
well as non-numeric data.
Example 5.3
In the list of height of students, mode is 110 as its
frequency of occurrence in the list is 3, which is larger
than the frequency of rest of the values.
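The three measures worked out in Examples 5.1 to 5.3 can be verified with Python's built-in statistics module. The variable names below are illustrative.

```python
import statistics

heights = [90, 102, 110, 115, 85, 90, 100, 110, 110]

mean_height = statistics.mean(heights)       # sum of values / number of values
median_height = statistics.median(heights)   # middle value after sorting
mode_height = statistics.mode(heights)       # most frequent value

print(round(mean_height, 2))   # 101.33
print(median_height)           # 102
print(mode_height)             # 110
```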
5.5.2 Measures of Variability
The measures of variability refer to the spread or variation
of the values around the mean. They are also called
measures of dispersion that indicate the degree of diversity
in a data set. They also indicate difference within the group.
Two different data sets can have the same mean, median
or mode but completely different levels of dispersion, or
vice versa. Common measures of dispersion or variability
are Range and Standard Deviation.
Example 5.5
Let us compute the standard deviation of the height
of nine students that we used while calculating
Mean. The Mean ($\bar{x}$) was calculated to be 101.33 cm.
Subtract the mean from each value and take the square
of that difference. Dividing the sum of squared values by
the total number of values and taking its square root
gives the standard deviation in data. See Table 5.3
for details.
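Last row of Table 5.3: $x = 110$, $x - \bar{x} = 8.67$, $(x - \bar{x})^2 = 75.17$
Totals: $n = 9$, $\bar{x} = 101.33$, $\sum (x - \bar{x}) = 0.03$, $\sum (x - \bar{x})^2 = 938.00$
$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} = \sqrt{\frac{938.00}{9}} = \sqrt{104.22} = 10.2$ cm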
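The steps of Example 5.5 can be followed in Python as a check on the arithmetic. This is a sketch with illustrative variable names; statistics.pstdev() is the standard library function that divides by n, as done here, and the range is included because it is the other measure of variability named in this section.

```python
import math
import statistics

heights = [90, 102, 110, 115, 85, 90, 100, 110, 110]
n = len(heights)
mean_height = sum(heights) / n

# Square of the difference of each value from the mean
squared_diffs = [(x - mean_height) ** 2 for x in heights]

# Average of the squared differences, then its square root
std_dev = math.sqrt(sum(squared_diffs) / n)
print(round(sum(squared_diffs), 2))            # 938.0
print(round(std_dev, 1))                       # 10.2

# The library function gives the same result
print(round(statistics.pstdev(heights), 1))    # 10.2

# Range: difference between maximum and minimum values
height_range = max(heights) - min(heights)
print(height_range)                            # 30
```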
Teacher wants to know about the average performance of the whole class in
a test.
Find the popular color for car after surveying the car owners of a small city.
Summary
• Data refer to unorganised facts that can be processed
to generate meaningful result or information.
• Data can be structured or unstructured.
• Hard Disk, SSD, CD/DVD, Pen Drive, Memory
Card, etc. are some of the commonly used storage
devices.
• Data Processing cycle involves input and storage
of data, its processing and generating output.
• Summarising data using statistical techniques aids
in revealing data characteristics.
• Mean, Median, Mode, Range, and Standard
Deviation are some of the statistical techniques
used for data summarisation.
• Mean is the average of given values.
• Median is the mid value when data are sorted in
ascending/descending order.
• Mode is the data value that appears most number
of times.
• Range is the difference between the maximum and
minimum values.
• Standard deviation is the positive square root of
the average of squared difference of each value
from the mean.
Exercise
1. Identify data required to be maintained to perform the
following services:
a) Declare exam results and print e-certificates
b) Register participants in an exhibition and issue
biometric ID cards
c) To search for an image by a search engine
d) To book an OPD appointment with a hospital in a
specific department
2. A school having 500 students wants to identify
beneficiaries of the merit-cum means scholarship,
achieving more than 75% for two consecutive years
and having family income less than 5 lakh per annum.
Chapter 6
Introduction to NumPy
In this chapter
»» Introduction
»» Array
»» NumPy Array
»» Indexing and Slicing
»» Operations on Arrays
»» Concatenating Arrays
»» Reshaping Arrays
»» Splitting Arrays
»» Statistical Operations on Arrays
»» Loading Arrays from Files
»» Saving NumPy Arrays in Files on Disk
“The goal is to turn data into information,
and information into insight.”
— Carly Fiorina
6.1 Introduction
NumPy stands for ‘Numerical Python’. It is a
package for data analysis and scientific computing
with Python. NumPy uses a multidimensional
array object, and has functions and tools
for working with these arrays. The powerful
n-dimensional array in NumPy speeds-up data
processing. NumPy can be easily interfaced with
other Python packages and provides tools for
integrating with other programming languages
like C, C++ etc.
Installing NumPy
NumPy can be installed by typing the following command:
pip install numpy
6.2 Array
We have learnt about various data types like list, tuple,
and dictionary. In this chapter we will discuss another
datatype, ‘Array’. An array is a data type used to store
multiple values using a single identifier (variable name).
An array contains an ordered collection of data elements
where each element is of the same type and can be
referenced by its index (position).
The important characteristics of an array are:
• Each element of the array is of the same data
type, though the values stored in them may be
different.
• The entire array is stored contiguously in
memory. This makes operations on arrays fast.
• Each element of the array is identified or
referred to using the name of the array along with
the index of that element, which is unique for
each element. The index of an element is an
integral value associated with the element,
based on the element’s position in the array.
Contiguous memory allocation:
The memory space must be divided into fixed-sized
positions, and each position is allocated to a single
data item only.
Non-contiguous memory allocation:
The data is divided into several blocks, which are placed
in different parts of the memory according to the
availability of memory space.
For example consider an array with 5 numbers:
[ 10, 9, 99, 71, 90 ]
Here, the 1st value in the array is 10 and has the
index value [0] associated with it; the 2nd value in the
array is 9 and has the index value [1] associated with
it, and so on. The last value (in this case the 5th value)
in this array has an index [4]. This is called zero based
indexing. This is very similar to the indexing of lists in
Python. The idea of arrays is so important that almost
all programming languages support it in one form or
another.
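Zero-based indexing can be tried out with a short sketch. The variable name numbers is only illustrative; the values are the five numbers used in the paragraph above.

```python
import numpy as np

numbers = np.array([10, 9, 99, 71, 90])   # the five numbers from the text

first = numbers[0]        # 1st value has index 0
second = numbers[1]       # 2nd value has index 1
last = numbers[4]         # 5th value has index 4

# The last index is always one less than the number of elements
also_last = numbers[len(numbers) - 1]

print(first, second, last, also_last)
```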
Vedika 76 75 47
Harun 84 59 60
Prasad 67 72 54
6.4.2 Slicing
Sometimes we need to extract part of an array. This is
done through slicing. We can define which part of the
array to be sliced by specifying the start and end index
values using [start : end] along with the array name.
Example 6.7
>>> array8
array([-2, 2, 6, 10, 14, 18, 22])
Now let us see how slicing is done for 2-D arrays.
For this, let us create a 2-D array called array9 having
3 rows and 4 columns.
>>> array9 = np.array([[ -7, 0, 10, 20],
[ -5, 1, 40, 200],
[ -1, 1, 4, 30]])
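The [start : end] rule can be checked on array8 and array9 with a small sketch. The names part, reversed_copy, third_column and block are introduced here only for illustration.

```python
import numpy as np

array8 = np.array([-2, 2, 6, 10, 14, 18, 22])

# Elements at index 3 and 4; the element at the end index (5) is excluded
part = array8[3:5]

# A negative step walks through the array backwards
reversed_copy = array8[::-1]

array9 = np.array([[-7, 0, 10, 20],
                   [-5, 1, 40, 200],
                   [-1, 1, 4, 30]])

# In 2-D arrays the row slice and the column slice are separated by a comma
third_column = array9[:, 2]      # all rows, column with index 2
block = array9[0:2, 1:3]         # rows 0 and 1, columns 1 and 2

print(part, third_column)
print(block)
```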
#Subtraction
>>> array1 - array2
array([[ -7, -14],
[-11, -10]])
#Multiplication
>>> array1 * array2
array([[ 30, 120],
[ 60, 24]])
#Matrix Multiplication
>>> array1 @ array2
array([[120, 132],
[ 70, 104]])
#Exponentiation
>>> array1 ** 3
array([[ 27, 216],
[ 64, 8]], dtype=int32)
#Division
>>> array2 / array1
array([[3.33333333, 3.33333333],
[3.75 , 6. ]])
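The statements that create array1 and array2 are not shown in this part of the chapter. The sketch below assumes the values that are consistent with every result printed above, and repeats the operations so that they can be verified.

```python
import numpy as np

# Assumed operands, inferred from the printed results (not given in this section)
array1 = np.array([[3, 6], [4, 2]])
array2 = np.array([[10, 20], [15, 12]])

difference = array1 - array2        # element-wise subtraction
product = array1 * array2           # element-wise multiplication
matrix_product = array1 @ array2    # matrix multiplication (rows by columns)
cubes = array1 ** 3                 # each element raised to the power 3
quotient = array2 / array1          # element-wise division, result is float

print(product)
print(matrix_product)
```

Note the difference between * and @: the first multiplies elements at the same position, while the second follows the rules of matrix multiplication.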
6.5.3 Sorting
Sorting is to arrange the elements of an array in
hierarchical order either ascending or descending. By
default, numpy does sorting in ascending order.
>>> array4 = np.array([1,0,2,-3,6,8,4,7])
>>> array4.sort()
>>> array4
array([-3, 0, 1, 2, 4, 6, 7, 8])
In 2-D array, sorting can be done along either of the
axes i.e., row-wise or column-wise. By default, sorting
is done row-wise (i.e., on axis = 1). It means to arrange
elements in each row in ascending order. When axis=0,
sorting is done column-wise, which means each column
is sorted in ascending order.
>>> array4 = np.array([[10,-7,0, 20],
[-5,1,200,40],[30,1,-1,4]])
>>> array4
array([[ 10, -7, 0, 20],
[ -5, 1, 200, 40],
[ 30, 1, -1, 4]])
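Row-wise and column-wise sorting of array4 can be compared with the sketch below. It uses np.sort(), which returns a sorted copy, instead of the sort() method, which changes the array itself; this is a choice made here so that the original array stays available for comparison.

```python
import numpy as np

array4 = np.array([[10, -7, 0, 20],
                   [-5, 1, 200, 40],
                   [30, 1, -1, 4]])

by_row = np.sort(array4, axis=1)      # each row sorted in ascending order
by_column = np.sort(array4, axis=0)   # each column sorted in ascending order

# One way to get descending order: sort ascending, then reverse each row
by_row_desc = np.sort(array4, axis=1)[:, ::-1]

print(by_row)
print(by_column)
```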
>>> array1
array([[ 10, 20],
[-30, 40]])
>>> array2
array([[0, 0, 0],
[0, 0, 0]])
>>> array1.shape
(2, 2)
>>> array2.shape
(2, 3)
>>> array3.reshape(2,6)
array([[10, 11, 12, 13, 14, 15],
[16, 17, 18, 19, 20, 21]])
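The reshape() result above can be reproduced with a short sketch. It assumes that array3 holds the 12 values from 10 to 21, which is what the printed output suggests. Reshaping works only when the new shape has the same total number of elements.

```python
import numpy as np

array3 = np.arange(10, 22)          # assumed: 12 values, 10 to 21

two_by_six = array3.reshape(2, 6)   # 2 x 6 = 12 elements
three_by_four = array3.reshape(3, 4)

# 5 x 3 = 15 does not match 12 elements, so NumPy raises an error
try:
    array3.reshape(5, 3)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(two_by_six)
print(mismatch_rejected)
```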
>>> secondc
array([[-7],
[ 1],
[ 1],
[ 2],
[ 1]])
>>> thirdc
array([[ 0, 20],
[200, 40],
[ -1, 4],
[ 0, 4],
[ 0, 2]])
>>> secondhalf
array([[ 0, 20],
[200, 40],
[ -1, 4],
[ 0, 4],
[ 0, 2]])
>>> arrayB.std(axis=0)
array([0.5, 2. ])
>>> arrayB.std(axis=1)
array([1.5, 1. ])
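The effect of the axis parameter on std() can be seen in the sketch below. The contents of arrayB are an assumption here: a 2 x 2 array was picked that gives exactly the two results printed above.

```python
import numpy as np

# Assumed values for arrayB (consistent with the printed results)
arrayB = np.array([[3, 6], [4, 2]])

std_columns = arrayB.std(axis=0)   # one value per column
std_rows = arrayB.std(axis=1)      # one value per row
std_all = arrayB.std()             # single value for the whole array

print(std_columns, std_rows, round(float(std_all), 3))
```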
>>> studentdata
array([[ 1, 36, 18, 57],
[ 2, 22, 23, 45],
[ 3, 43, 51, 37],
[ 4, 41, 40, 60],
[ 5, 13, 18, 27]])
In the above statement, first we specify the name
and path of the text file containing the data. Let us
understand some of the parameters that we pass in the
np.loadtxt() function:
>>> mks1
array([36, 22, 43, 41, 13])
>>> mks2
array([18, 23, 51, 40, 18])
>>> mks3
array([57, 45, 37, 60, 27])
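A minimal sketch of loading such data is given below. To keep it runnable without a file on disk, the file contents are held in a string and wrapped in io.StringIO, which np.loadtxt() reads like a file; the header names in the first line are hypothetical, while the numbers are those of studentdata.

```python
import io
import numpy as np

# Stand-in for a text file; the header names are hypothetical
text = """RollNo,Marks1,Marks2,Marks3
1,36,18,57
2,22,23,45
3,43,51,37
4,41,40,60
5,13,18,27
"""

# skiprows=1 skips the header line, delimiter=',' separates the values
studentdata = np.loadtxt(io.StringIO(text), skiprows=1,
                         delimiter=',', dtype=int)

# unpack=True returns each column as a separate array
stno, mks1, mks2, mks3 = np.loadtxt(io.StringIO(text), skiprows=1,
                                    delimiter=',', dtype=int,
                                    unpack=True)

print(studentdata.shape)
print(mks1, mks2, mks3)
```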
>>> dataarray
array([[ 1., 36., 18., 57.],
[ 2., nan, 23., 45.],
[ 3., 43., 51., nan],
[ 4., 41., 40., 60.],
[ 5., 13., 18., 27.]])
The genfromtxt() function converts missing values
and character strings in numeric columns to nan. But if
we specify dtype as int, it converts the missing or other
non numeric values to -1. We can also convert these
missing values and character strings in the data files
to some specific value using the parameter
filling_values.
Example 6.10 Let us set the value of the missing or non
numeric data to -999:
>>> dataarray = np.genfromtxt('C:/NCERT/dataMissing.txt',
        skip_header=1, delimiter=',',
        filling_values=-999, dtype=int)
>>> dataarray
array([[ 1, 36, 18, 57],
[ 2, -999, 23, 45],
[ 3, 43, 51, -999],
[ 4, 41, 40, 60],
[ 5, 13, 18, 27]])
Activity 6.2
Can you create a datafile and import data into multiple
NumPy arrays column wise? (Hint: use unpack parameter)
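The handling of missing values can be tried without creating a file, as in the sketch below. The text held in the string is an assumed version of the data file (its header names are hypothetical); two marks are left empty at the positions where nan and -999 appear in the outputs above.

```python
import io
import numpy as np

# Assumed file contents: two marks are missing (empty fields)
text = """RollNo,Marks1,Marks2,Marks3
1,36,18,57
2,,23,45
3,43,51,
4,41,40,60
5,13,18,27
"""

# Default dtype is float: missing values become nan
as_float = np.genfromtxt(io.StringIO(text), skip_header=1, delimiter=',')

# With filling_values, missing values are replaced by the given number
as_int = np.genfromtxt(io.StringIO(text), skip_header=1, delimiter=',',
                       filling_values=-999, dtype=int)

print(as_float)
print(as_int)
```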
Summary
• Array is a data type that holds objects of same
datatype (numeric, textual, etc.). The elements of
an array are stored contiguously in memory. Each
element of an array has an index or position value.
• NumPy is a Python library for scientific computing
which stores data in a powerful n-dimensional
ndarray object for faster calculations.
• Each element of an array is referenced by the array
name along with the index of that element.
• numpy.array() is a function that returns an object
of type numpy.ndarray.
• All arithmetic operations can be performed on
arrays when shape of the two arrays is same.
• NumPy arrays are not expandable or extendable.
Once a numpy array is defined, the space it occupies
in memory is fixed and cannot be changed.
• numpy.split() slices apart an array into multiple
sub-arrays along an axis.
• numpy.concatenate() function can be used to
concatenate arrays.
• numpy.loadtxt() and numpy.genfromtxt() are
functions used to load data from files. The savetxt()
function is used to save a NumPy array to a
text file.
Exercise
1. What is NumPy ? How to install it?
2. What is an array and how is it different from a list? What
is the name of the built-in array class in NumPy ?
3. What do you understand by rank of an ndarray?
4. Create the following NumPy arrays:
a) A 1-D array called zeros having 10 elements and
all the elements are set to zero.
b) A 1-D array called vowels having the elements ‘a’,
‘e’, ‘i’, ‘o’ and ‘u’.
c) A 2-D array called ones having 2 rows and 5
columns and all the elements are set to 1 and
dtype as int.
d) Use nested Python lists to create a 2-D array called
myarray1 having 3 rows and 3 columns and store
the following data:
2.7, -2, -19
0, 3.4, 99.9
10.6, 0, 13
e) A 2-D array called myarray2 using arange()
having 3 rows and 5 columns with start value = 4,
step size 4 and dtype as float.
5. Using the arrays created in Question 4 above, write
NumPy commands for the following:
a) Find the dimensions, shape, size, data type of the
items and itemsize of arrays zeros, vowels,
ones, myarray1 and myarray2.
b) Reshape the array ones to have all the 10 elements
in a single row.
c) Display the 2nd and 3rd element of the array vowels.
d) Display all elements in the 2nd and 3rd row of the
array myarray1.
e) Display the elements in the 1st and 2nd column of
the array myarray1.
f) Display the elements in the 1st column of the 2nd
and 3rd row of the array myarray1.
g) Reverse the array of vowels.
6. Using the arrays created in Question 4 above, write
NumPy commands for the following:
You may type this using any text editor (Notepad, gEdit
or any other) in the way as shown below and store the
file with a name called Iris.txt. (In case you wish to work
with the entire dataset you could download a .csv file for the
same from the Internet and save it as Iris.txt). The
headers are:
sepal length, sepal width, petal length, petal width, iris,
Species No
5.1, 3.5, 1.4, 0.2, Iris-setosa, 1
4.9, 3, 1.4, 0.2, Iris-setosa, 1
4.7, 3.2, 1.3, 0.2, Iris-setosa, 1
4.6, 3.1, 1.5, 0.2, Iris-setosa, 1
5, 3.6, 1.4, 0.2, Iris-setosa, 1
5.4, 3.9, 1.7, 0.4, Iris-setosa, 1
4.6, 3.4, 1.4, 0.3, Iris-setosa, 1
5, 3.4, 1.5, 0.2, Iris-setosa, 1
4.4, 2.9, 1.4, 0.2, Iris-setosa, 1
4.9, 3.1, 1.5, 0.1, Iris-setosa, 1
9. Similarly find the max, min, mean and standard deviation
for the columns of the iris1, iris2 and iris3 and
store the results in the arrays with appropriate names.
10. Check the minimum value for sepal length, sepal width,
petal length and petal width of the three species in
comparison to the minimum value of sepal length, sepal
width, petal length and petal width for the data set as a
whole and fill the table below with True if the species value
is greater than the dataset value and False otherwise.
sepal length
sepal width
petal length
petal width
# Solution to Q1
>>> iris = np.genfromtxt('C:/NCERT/Iris.txt',
        skip_header=1, delimiter=',', dtype=float)
# Solution to Q2
>>> iris = iris[0:30,[0,1,2,3,5]] # drop column 4
# Solution to Q3
>>> iris.shape
(30, 5)
>>> iris.ndim
2
>>> iris.size
150
# Solution to Q4
# Split into three arrays, each array for a different
# species
>>> iris1, iris2, iris3 = np.split(iris, [10,20],
axis=0)
# Solution to Q5
# Print the three arrays
>>> iris1
array([[5.1, 3.5, 1.4, 0.2, 1. ],
[4.9, 3. , 1.4, 0.2, 1. ],
[4.7, 3.2, 1.3, 0.2, 1. ],
[4.6, 3.1, 1.5, 0.2, 1. ],
[5. , 3.6, 1.4, 0.2, 1. ],
[5.4, 3.9, 1.7, 0.4, 1. ],
[4.6, 3.4, 1.4, 0.3, 1. ],
[5. , 3.4, 1.5, 0.2, 1. ],
[4.4, 2.9, 1.4, 0.2, 1. ],
[4.9, 3.1, 1.5, 0.1, 1. ]])
>>> iris2
array([[5.5, 2.6, 4.4, 1.2, 2. ],
[6.1, 3. , 4.6, 1.4, 2. ],
[5.8, 2.6, 4. , 1.2, 2. ],
[5. , 2.3, 3.3, 1. , 2. ],
[5.6, 2.7, 4.2, 1.3, 2. ],
[5.7, 3. , 4.2, 1.2, 2. ],
[5.7, 2.9, 4.2, 1.3, 2. ],
[6.2, 2.9, 4.3, 1.3, 2. ],
[5.1, 2.5, 3. , 1.1, 2. ],
[5.7, 2.8, 4.1, 1.3, 2. ]])
>>> iris3
array([[6.9, 3.1, 5.4, 2.1, 3. ],
[6.7, 3.1, 5.6, 2.4, 3. ],
[6.9, 3.1, 5.1, 2.3, 3. ],
[5.8, 2.7, 5.1, 1.9, 3. ],
[6.8, 3.2, 5.9, 2.3, 3. ],
[6.7, 3.3, 5.7, 2.5, 3. ],
[6.7, 3. , 5.2, 2.3, 3. ],
[6.3, 2.5, 5. , 1.9, 3. ],
[6.5, 3. , 5.2, 2. , 3. ],
[6.2, 3.4, 5.4, 2.3, 3. ]])
# Solution to Q6
>>> header = np.array(["sepal length", "sepal width",
        "petal length", "petal width",
        "Species No"])
# Solution to Q7
>>> print(header)
['sepal length' 'sepal width' 'petal length' 'petal
width' 'Species No']
# Solution to Q8
# Stats for array iris
# Finds the max of the data for sepal length, sepal
# width, petal length, petal width, Species No
>>> iris_max = iris.max(axis=0)
>>> iris_max
array([6.9, 3.9, 5.9, 2.5, 3. ])
# Solution to Q9
>>> iris1_max = iris1.max(axis=0)
>>> iris1_max
array([5.4, 3.9, 1.7, 0.4, 1. ])
# Solution to Q11
#Compare Iris setosa and Iris virginica
>>> iris1_avg[1] > iris2_avg[1] #sepal width
True
# Solution to Q12
>>> iris1_avg[2] > iris2_avg[2] #petal length
False
# Solution to Q13
>>> iris1_avg[3] > iris2_avg[3] #petal width
False
# Solution to Q14
>>> np.savetxt('C:/NCERT/IrisMeanValues.txt',
iris_avg, delimiter = ',')
# Solution to Q15
>>> np.savetxt('C:/NCERT/IrisStat.txt',
        (iris_max, iris_avg, iris_min), delimiter=',')
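The main steps of the solutions above (splitting by species, column-wise statistics and saving to a file) can be rehearsed on a smaller array, as sketched below. The array sample holds only the first two rows of each species printed above, and the file name SampleStat.txt and the temporary folder are assumptions made so that the sketch does not depend on C:/NCERT.

```python
import os
import tempfile
import numpy as np

# First two rows of each species, copied from the arrays printed above
sample = np.array([[5.1, 3.5, 1.4, 0.2, 1.0],
                   [4.9, 3.0, 1.4, 0.2, 1.0],
                   [5.5, 2.6, 4.4, 1.2, 2.0],
                   [6.1, 3.0, 4.6, 1.4, 2.0],
                   [6.9, 3.1, 5.4, 2.1, 3.0],
                   [6.7, 3.1, 5.6, 2.4, 3.0]])

# Split before row 2 and before row 4: one array per species
s1, s2, s3 = np.split(sample, [2, 4], axis=0)

# axis=0 gives one result per column
sample_max = sample.max(axis=0)
sample_min = sample.min(axis=0)
sample_avg = sample.mean(axis=0)

# Save the three result arrays as three rows, then read them back
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'SampleStat.txt')   # hypothetical file name
    np.savetxt(path, (sample_max, sample_avg, sample_min), delimiter=',')
    restored = np.loadtxt(path, delimiter=',')

print(s2)
print(restored.shape)
```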
Chapter 7
Database Concepts
In this chapter
»» Introduction
»» File System
»» Database Management System
»» Relational Data Model
»» Keys in a Relational Database
“Inconsistency of your mind… Can damage
your memory… Remove the inconsistent
data… And keep the original one !!!”
— Nisarga Jain
7.1 Introduction
After learning about importance of data in the
previous chapter, we need to explore the methods
to store and manage data electronically. Let us
take an example of a school that maintains data
about its students, along with their attendance
record and guardian details.
The class teacher marks daily attendance of the
students in the attendance register. The teacher
records ‘P’ for present or ‘A’ for absent against
each student’s roll number on each working day.
If class strength is 50 and total working days in
[Figure: diagram with the labels Query, Result, Database and Catalog, and the tables Student, Guardian and Attendance]
Table 7.7 Relation schemas along with their description for the Student Attendance
database
Relation Scheme: STUDENT(RollNumber, SName, SDateofBirth, GUID)
Description of attributes:
RollNumber: unique id of the student
SName: name of the student
SDateofBirth: date of birth of the student
GUID: unique id of the guardian of the student
Relation Scheme: ATTENDANCE(AttendanceDate, RollNumber, AttendanceStatus)
Description of attributes:
AttendanceDate: date on which attendance is taken
RollNumber: roll number of the student
AttendanceStatus: whether present (P) or absent (A)
Note that the combination of AttendanceDate and RollNumber will be unique
in each record of the table
Relation Scheme: GUARDIAN(GUID, GName, GPhone, GAddress)
Description of attributes:
GUID: unique id of the guardian
GName: name of the guardian
GPhone: contact number of the guardian
GAddress: contact address of the guardian
Relation
State
101010101010 Himanshu Shah 9818184855 26/77, West Patel Nagar, Ahmedabad
333333333333 Danny Dsouza S -13, Ashok Village, Daman
466444444666 Sujata P. 7802983674 HNO-13, B- block, Preet Vihar, Madurai
Figure 7.2: StudentAttendance Database with the Primary and Foreign keys
Summary
• A file in a file system is a container to store data in a
computer.
• File system suffers from Data Redundancy, Data
Inconsistency, Data Isolation, Data Dependence and
Controlled Data sharing.
• Database Management System (DBMS) is a software
to create and manage databases. A database is a
collection of tables.
• Database schema is the design of a database.
• A database constraint is a restriction on the type of
data that can be inserted into the table.
• Database schema and database constraints are stored
in database Catalog.
Exercise
1. Give the terms for each of the following:
a) Collection of logically related records.
b) DBMS creates a file that contains description about the
data stored in the database.
c) Attribute that can uniquely identify the tuples in a
relation.
d) Special value that is stored when actual data value is
unknown for an attribute.
e) An attribute which can uniquely identify tuples of the
table but is not defined as primary key of the table.
f) Software that is used to create, manipulate and maintain
a relational database.
2. Why are foreign keys allowed to have NULL values? Explain
with an example.
Table: STUDENT
Roll No Name Class Section Registration_ID
11 Mohan XI 1 IP-101-15
12 Sohan XI 2 IP-104-15
21 John XII 1 CS-103-14
22 Meena XII 2 CS-101-14
23 Juhi XII 2 CS-101-10
Table: PROJECT ASSIGNED
Registration_ID ProjectNo
IP-101-15 101
IP-104-15 103
CS-103-14 102
CS-101-14 105
CS-101-10 104
Table: PROJECT
ProjectNo PName SubmissionDate
101 Airline Database 12/01/2018
102 Library Database 12/01/2018
103 Employee Database 15/01/2018
104 Student Database 12/01/2018
105 Inventory Database 15/01/2018
106 Railway Database 15/01/2018
Chapter 8
Introduction to Structured Query Language (SQL)
In this chapter
»» Introduction
»» Structured Query Language (SQL)
»» Data Types and Constraints in MySQL
»» SQL for Data Definition
»» SQL for Data Manipulation
»» SQL for Data Query
»» Data Updation and Deletion
“The most important motivation for the
research work that resulted in the relational
model was the objective of providing a sharp
and clear boundary between the logical and
physical aspects of database management.”
– E. F. Codd
8.1 Introduction
We have learnt about the Relational Database
Management System (RDBMS) and its purpose in the
previous chapter. There are many RDBMS such
as MySQL, Microsoft SQL Server, PostgreSQL,
Oracle, etc. that allow us to create a database
consisting of relations and to link one or more
relations for efficient querying, so that we can store, retrieve
and manipulate data in that database. In this
chapter, we will learn how to create, populate and
query a database using MySQL.
INT: INT specifies an integer value. Each INT value occupies 4 bytes of storage. The
range of values allowed in integer type are -2147483648 to 2147483647. For
values larger than that, we have to use BIGINT, which occupies 8 bytes.
FLOAT: Holds numbers with decimal points. Each FLOAT value occupies 4 bytes.
DATE: The DATE type is used for dates in 'YYYY-MM-DD' format. YYYY is the 4 digit
year, MM is the 2 digit month and DD is the 2 digit date. The supported range
is '1000-01-01' to '9999-12-31'.
8.3.2 Constraints
Constraints are certain types of restrictions on the data
values that an attribute can have. They are used to
ensure the accuracy and reliability of data. However, it
is not mandatory to define a constraint for each attribute
of a table. Table 8.2 lists various SQL constraints.
Think and Reflect
Which two constraints when applied together
will produce a Primary Key constraint?
Table 8.2 Commonly used SQL Constraints
Constraint | Description
NOT NULL | Ensures that a column cannot have NULL values where NULL means missing/unknown/not applicable value.
UNIQUE | Ensures that all the values in a column are distinct/unique.
DEFAULT | A default value specified for the column if no value is provided.
PRIMARY KEY | The column which can uniquely identify each row or record in a table.
FOREIGN KEY | The column which refers to value of an attribute defined as primary key in another table.
Table 8.4 Data types and constraints for the attributes of relation GUARDIAN
Attribute Name | Data expected to be stored | Data type | Constraint
GUID | Numeric value consisting of 12 digit Aadhaar number | CHAR(12) | PRIMARY KEY
GName | Variant length string of maximum 20 characters | VARCHAR(20) | NOT NULL
GPhone | Numeric value consisting of 10 digits | CHAR(10) | NULL UNIQUE
GAddress | Variant length string of size 30 characters | VARCHAR(30) | NOT NULL
Table 8.5 Data types and constraints for the attributes of relation ATTENDANCE.
Attribute Name | Data expected to be stored | Data type | Constraint
AttendanceDate | Date value | DATE | PRIMARY KEY*
RollNumber | Numeric value consisting of maximum 3 digits | INT | PRIMARY KEY*, FOREIGN KEY
AttendanceStatus | ‘P’ for present and ‘A’ for absent | CHAR(1) | NOT NULL
*means part of composite primary key
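How these constraints behave can be explored with the sketch below. It uses Python's built-in sqlite3 module as a stand-in for MySQL, which is an assumption made only so that the example runs without a database server; SQLite does not enforce the lengths in CHAR(12) or VARCHAR(20), and it checks foreign keys only after the PRAGMA shown. The data types of STUDENT, the student's name and date of birth, and the GUID 999999999999 are hypothetical.

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.execute('PRAGMA foreign_keys = ON')   # SQLite-specific switch

con.executescript("""
CREATE TABLE GUARDIAN(
    GUID CHAR(12) PRIMARY KEY,
    GName VARCHAR(20) NOT NULL,
    GPhone CHAR(10) UNIQUE,
    GAddress VARCHAR(30) NOT NULL);
CREATE TABLE STUDENT(
    RollNumber INT PRIMARY KEY,
    SName VARCHAR(20) NOT NULL,
    SDateofBirth DATE NOT NULL,
    GUID CHAR(12),
    FOREIGN KEY (GUID) REFERENCES GUARDIAN(GUID));
CREATE TABLE ATTENDANCE(
    AttendanceDate DATE,
    RollNumber INT,
    AttendanceStatus CHAR(1) NOT NULL,
    PRIMARY KEY (AttendanceDate, RollNumber),
    FOREIGN KEY (RollNumber) REFERENCES STUDENT(RollNumber));
""")

# GPhone may be NULL because it has no NOT NULL constraint
con.execute("INSERT INTO GUARDIAN VALUES ('333333333333', 'Danny Dsouza', "
            "NULL, 'S -13, Ashok Village, Daman')")
con.execute("INSERT INTO STUDENT VALUES (1, 'Sample Student', '2005-01-01', "
            "'333333333333')")
con.execute("INSERT INTO ATTENDANCE VALUES ('2018-09-01', 1, 'P')")


def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        con.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Same date and roll number again: composite primary key is violated
duplicate_key = rejected("INSERT INTO ATTENDANCE VALUES ('2018-09-01', 1, 'A')")
# Roll number 99 does not exist in STUDENT: foreign key is violated
unknown_student = rejected("INSERT INTO ATTENDANCE VALUES ('2018-09-01', 99, 'P')")
# GName is left out: NOT NULL is violated
missing_name = rejected("INSERT INTO GUARDIAN (GUID, GAddress) "
                        "VALUES ('999999999999', 'Hypothetical address')")

print(duplicate_key, unknown_student, missing_name)
```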
Cautions:
1) Using the Drop statement to remove a database will
ultimately remove all the tables within it.
2) DROP statement will remove the tables or database
created by you. Hence you may apply DROP statement at
the end of the chapter.
Note:
i) Annual Salary will not be added as a new column in the
database table. It is just for displaying the output of the
query.
ii) If an aliased column name has space as in the case of Annual
Salary, it should be enclosed in quotes as 'Annual Salary'.
(C) DISTINCT Clause
By default, SQL shows all the data retrieved through
query as output. However, there can be duplicate values.
The SELECT statement when combined with DISTINCT
clause, returns records without repetition (distinct
records). For example, while retrieving employee’s
department number, there can be duplicate values as
many employees are assigned to same department. To
display unique department number for all the employees,
we use DISTINCT as shown below:
mysql> SELECT DISTINCT DeptId
-> FROM EMPLOYEE;
+--------+
| DeptId |
+--------+
| D02 |
| D01 |
| D04 |
| D03 |
| D05 |
+--------+
5 rows in set (0.03 sec)
(D) WHERE Clause
The WHERE clause is used to retrieve data that meet
some specified conditions. In the OFFICE database,
more than one employee can have the same salary. To
display distinct salaries of the employees working in the
department number D01, we write the following query
in which the condition to select the employee whose
department number is D01 is specified using the WHERE
clause:
mysql> SELECT DISTINCT Salary
-> FROM EMPLOYEE
-> WHERE Deptid='D01';
As the column DeptId is of string type, its values are
enclosed in quotes ('D01').
+--------+
| Salary |
+--------+
| 60000 |
| 45000 |
| 15000 |
+--------+
3 rows in set (0.02 sec)
Example 8.5 The following query displays records of all the
employees except Aaliya.
mysql> SELECT *
-> FROM EMPLOYEE
-> WHERE NOT Ename = 'Aaliya';
+-------+----------+--------+-------+--------+
| EmpNo | Ename | Salary | Bonus | DeptId |
+-------+----------+--------+-------+--------+
| 102 | Kritika | 60000 | 123 | D01 |
| 103 | Shabbir | 45000 | 566 | D01 |
| 104 | Gurpreet | 19000 | 565 | D04 |
| 105 | Joseph | 34000 | 875 | D03 |
| 106 | Sanya | 48000 | 695 | D02 |
| 107 | Vergese | 15000 | NULL | D01 |
| 108 | Nachaobi | 29000 | NULL | D05 |
| 109 | Daribha | 42000 | NULL | D04 |
| 110 | Tanya | 50000 | 467 | D05 |
+-------+----------+--------+-------+--------+
9 rows in set (0.00 sec)
Think and Reflect
What will happen if in the above query
we write “Aaliya” as “AALIYA” or “aaliya”
or “AaLIYA”? Will the query generate the same
output or an error?
Example 8.6 The following query displays name and
department number of all those employees who are earning
salary between 20000 and 50000 (both values inclusive).
mysql> SELECT Ename, DeptId
-> FROM EMPLOYEE
-> WHERE Salary>=20000 AND Salary<=50000;
+----------+--------+
| Ename | DeptId |
+----------+--------+
| Shabbir | D01 |
| Joseph | D03 |
| Sanya | D02 |
| Nachaobi | D05 |
| Daribha | D04 |
| Tanya | D05 |
+----------+--------+
6 rows in set (0.00 sec)
Activity 8.8
Compare the output produced by the query
in example 8.6 and the following query
and differentiate between the OR and AND
operators.
SELECT *
FROM EMPLOYEE
WHERE Salary > 5000 OR
DeptId= 20;
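The DISTINCT, WHERE, NOT and AND queries of this section can be rerun from Python, as sketched below. The sketch uses the built-in sqlite3 module in place of MySQL (an assumption made so that no server is needed), and the column types chosen for EMPLOYEE are also assumptions; the ten records are those shown in the query outputs of this chapter.

```python
import sqlite3

# EmpNo, Ename, Salary, Bonus, DeptId (None stands for NULL)
rows = [
    (101, 'Aaliya', 10000, 234, 'D02'),
    (102, 'Kritika', 60000, 123, 'D01'),
    (103, 'Shabbir', 45000, 566, 'D01'),
    (104, 'Gurpreet', 19000, 565, 'D04'),
    (105, 'Joseph', 34000, 875, 'D03'),
    (106, 'Sanya', 48000, 695, 'D02'),
    (107, 'Vergese', 15000, None, 'D01'),
    (108, 'Nachaobi', 29000, None, 'D05'),
    (109, 'Daribha', 42000, None, 'D04'),
    (110, 'Tanya', 50000, 467, 'D05'),
]

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE EMPLOYEE(EmpNo INT, Ename TEXT, Salary INT, '
            'Bonus INT, DeptId TEXT)')
con.executemany('INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?)', rows)

# DISTINCT removes repeated department numbers
departments = sorted(r[0] for r in con.execute(
    'SELECT DISTINCT DeptId FROM EMPLOYEE'))

# WHERE keeps only the rows of department D01
d01_salaries = sorted(r[0] for r in con.execute(
    "SELECT DISTINCT Salary FROM EMPLOYEE WHERE DeptId = 'D01'"))

# NOT excludes the matching row
all_but_aaliya = con.execute(
    "SELECT COUNT(*) FROM EMPLOYEE WHERE NOT Ename = 'Aaliya'").fetchone()[0]

# AND requires both conditions to be true
mid_salary = [r[0] for r in con.execute(
    'SELECT Ename FROM EMPLOYEE '
    'WHERE Salary >= 20000 AND Salary <= 50000 ORDER BY EmpNo')]

print(departments, d01_salaries, all_but_aaliya)
print(mid_salary)
```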
Example 8.8 The following query displays details of all the
employees except those working in department number D01
or D02.
mysql> SELECT *
-> FROM EMPLOYEE
-> WHERE DeptId NOT IN('D01', 'D02');
+-------+----------+--------+-------+--------+
| EmpNo | Ename | Salary | Bonus | DeptId |
+-------+----------+--------+-------+--------+
| 104 | Gurpreet | 19000 | 565 | D04 |
| 105 | Joseph | 34000 | 875 | D03 |
| 108 | Nachaobi | 29000 | NULL | D05 |
| 109 | Daribha | 42000 | NULL | D04 |
| 110 | Tanya | 50000 | 467 | D05 |
+-------+----------+--------+-------+--------+
5 rows in set (0.00 sec)
Note: Here we need to combine NOT with IN as we want to retrieve
all records except with DeptId D01 and D02.
(F) ORDER BY Clause
ORDER BY clause is used to display data in an ordered
(arranged) form with respect to a specified column. By
default, ORDER BY displays records in ascending order of
the specified column’s values. To display the records in
descending order, the DESC (means descending) keyword
needs to be written with that column.
Example 8.9 The following query displays details of all the
employees in ascending order of their salaries.
mysql> SELECT *
-> FROM EMPLOYEE
-> ORDER BY Salary;
+-------+----------+--------+-------+--------+
| EmpNo | Ename | Salary | Bonus | DeptId |
+-------+----------+--------+-------+--------+
| 101 | Aaliya | 10000 | 234 | D02 |
| 107 | Vergese | 15000 | NULL | D01 |
| 104 | Gurpreet | 19000 | 565 | D04 |
| 108 | Nachaobi | 29000 | NULL | D05 |
| 105 | Joseph | 34000 | 875 | D03 |
| 109 | Daribha | 42000 | NULL | D04 |
| 103 | Shabbir | 45000 | 566 | D01 |
| 106 | Sanya | 48000 | 695 | D02 |
| 110 | Tanya | 50000 | 467 | D05 |
| 102 | Kritika | 60000 | 123 | D01 |
+-------+----------+--------+-------+--------+
10 rows in set (0.05 sec)
+-------+----------+--------+-------+--------+
| EmpNo | Ename | Salary | Bonus | DeptId |
+-------+----------+--------+-------+--------+
| 102 | Kritika | 60000 | 123 | D01 |
| 110 | Tanya | 50000 | 467 | D05 |
| 106 | Sanya | 48000 | 695 | D02 |
| 103 | Shabbir | 45000 | 566 | D01 |
| 109 | Daribha | 42000 | NULL | D04 |
| 105 | Joseph | 34000 | 875 | D03 |
| 108 | Nachaobi | 29000 | NULL | D05 |
| 104 | Gurpreet | 19000 | 565 | D04 |
| 107 | Vergese | 15000 | NULL | D01 |
| 101 | Aaliya | 10000 | 234 | D02 |
+-------+----------+--------+-------+--------+
10 rows in set (0.00 sec)
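Ascending and descending ordering can be compared side by side in the sketch below. It again uses Python's sqlite3 module as a stand-in for MySQL, with assumed column types and only five of the EMPLOYEE records, to keep the example short.

```python
import sqlite3

# Five of the EMPLOYEE records: Ename, Salary, DeptId
rows = [
    ('Aaliya', 10000, 'D02'),
    ('Kritika', 60000, 'D01'),
    ('Shabbir', 45000, 'D01'),
    ('Vergese', 15000, 'D01'),
    ('Tanya', 50000, 'D05'),
]

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE EMPLOYEE(Ename TEXT, Salary INT, DeptId TEXT)')
con.executemany('INSERT INTO EMPLOYEE VALUES (?, ?, ?)', rows)

# Default order is ascending
ascending = [r[0] for r in con.execute(
    'SELECT Ename FROM EMPLOYEE ORDER BY Salary')]

# DESC reverses the order for that column
descending = [r[0] for r in con.execute(
    'SELECT Ename FROM EMPLOYEE ORDER BY Salary DESC')]

# Two columns: department first, then highest salary first within it
by_department = [(r[0], r[1]) for r in con.execute(
    'SELECT DeptId, Ename FROM EMPLOYEE ORDER BY DeptId, Salary DESC')]

print(ascending)
print(descending)
print(by_department)
```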
(G) Handling NULL Values
SQL supports a special value called NULL to represent
a missing or unknown value. For example, the village
column in a table called address will have no value for
cities. Hence, NULL is used to represent such unknown
values. It is important to note that NULL is different
from 0 (zero). Also, any arithmetic operation performed
with NULL value gives NULL. For example: 5 + NULL =
NULL because NULL is unknown hence the result is also
unknown. In order to check for NULL value in a column,
we use IS NULL.
Example 8.11 The following query displays details of all
those employees who have not been given a bonus. This
implies that the bonus column will be blank.
mysql> SELECT *
-> FROM EMPLOYEE
-> WHERE Bonus IS NULL;
+-------+----------+--------+-------+--------+
| EmpNo | Ename | Salary | Bonus | DeptId |
+-------+----------+--------+-------+--------+
| 107 | Vergese | 15000 | NULL | D01 |
| 108 | Nachaobi | 29000 | NULL | D05 |
| 109 | Daribha | 42000 | NULL | D04 |
+-------+----------+--------+-------+--------+
3 rows in set (0.00 sec)
Activity 8.9
Execute the following two queries and find
out what will happen if we specify two columns
in the ORDER BY clause:
SELECT *
FROM EMPLOYEE
ORDER BY Salary, Bonus;
SELECT *
FROM EMPLOYEE
ORDER BY Salary, Bonus desc;
Example 8.12 The following query displays names of all the
employees who have been given a bonus. This implies that
the bonus column will not be blank.
mysql> SELECT EName
-> FROM EMPLOYEE
-> WHERE Bonus IS NOT NULL;
+----------+
| EName |
+----------+
| Aaliya |
| Kritika |
| Shabbir |
| Gurpreet |
| Joseph |
| Sanya |
| Tanya |
+----------+
7 rows in set (0.00 sec)
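The rules for NULL can be verified with the sketch below, which uses Python's sqlite3 module as a stand-in for MySQL (an assumption; in sqlite3 a NULL is returned to Python as None). Only four EMPLOYEE records are used, and the column types are assumed.

```python
import sqlite3

# Four of the EMPLOYEE records: EmpNo, Ename, Salary, Bonus
rows = [
    (101, 'Aaliya', 10000, 234),
    (107, 'Vergese', 15000, None),    # None is stored as NULL
    (108, 'Nachaobi', 29000, None),
    (110, 'Tanya', 50000, 467),
]

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE EMPLOYEE(EmpNo INT, Ename TEXT, Salary INT, Bonus INT)')
con.executemany('INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)', rows)

# Arithmetic with NULL gives NULL
five_plus_null = con.execute('SELECT 5 + NULL').fetchone()[0]

# Salary + Bonus is NULL wherever Bonus is NULL
total_pay = dict(con.execute('SELECT Ename, Salary + Bonus FROM EMPLOYEE'))

# IS NULL and IS NOT NULL are the correct tests
no_bonus = [r[0] for r in con.execute(
    'SELECT Ename FROM EMPLOYEE WHERE Bonus IS NULL ORDER BY EmpNo')]
with_bonus = [r[0] for r in con.execute(
    'SELECT Ename FROM EMPLOYEE WHERE Bonus IS NOT NULL ORDER BY EmpNo')]

# "= NULL" never matches, because a comparison with NULL is unknown
equals_null = con.execute(
    'SELECT COUNT(*) FROM EMPLOYEE WHERE Bonus = NULL').fetchone()[0]

print(five_plus_null, no_bonus, with_bonus, equals_null)
```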
+-------+---------+--------+-------+--------+
| EmpNo | Ename | Salary | Bonus | DeptId |
+-------+---------+--------+-------+--------+
| 102 | Kritika | 60000 | 123 | D01 |
+-------+---------+--------+-------+--------+
1 row in set (0.00 sec)
WHERE condition;
The STUDENT Table 8.7 has NULL value for GUID
for student with roll number 3. Also, suppose students
with roll numbers 3 and 5 are siblings. So, in STUDENT
table, we need to fill the GUID value for student with
roll number 3 as 101010101010. In order to update or
change GUID of a particular row (record), we need to
specify that record using WHERE clause, as shown below:
mysql> UPDATE STUDENT
-> SET GUID = 101010101010
-> WHERE RollNumber = 3;
Query OK, 1 row affected (0.06 sec)
Rows matched: 1 Changed: 1 Warnings: 0
We can then verify the updated data using the
statement SELECT * FROM STUDENT.
Caution: If we miss the WHERE clause in the UPDATE statement then
the GUID of all the records will be changed to 101010101010.
We can also update values for more than one column
using the UPDATE statement. Suppose, the guardian
(Table 8.6) with GUID 466444444666 has requested to
change the Address to 'WZ - 68, Azad Avenue, Bijnour,
MP' and Phone number to '9010810547'.
mysql> UPDATE GUARDIAN
-> SET GAddress = 'WZ - 68, Azad Avenue, Bijnour, MP',
-> GPhone = 9010810547
-> WHERE GUID = 466444444666;
Query OK, 1 row affected (0.06 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> SELECT * FROM GUARDIAN ;
+------------+---------------+----------+------------------------------------+
|GUID |GName |Gphone |GAddress |
+------------+---------------+----------+------------------------------------+
|444444444444|Amit Ahuja |5711492685|G-35, Ashok vihar, Delhi |
|111111111111|Baichung Bhutia|7110047139|Flat no. 5, Darjeeling Appt., Shimla|
|101010101010|Himanshu Shah |9818184855|26/77, West Patel Nagar, Ahmedabad |
|333333333333|Danny Dsouza |NULL |S -13, Ashok Village, Daman |
|466444444666|Sujata P. |9010810547|WZ - 68, Azad Avenue, Bijnour, MP |
+------------+---------------+----------+------------------------------------+
5 rows in set (0.00 sec)
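The caution about a missing WHERE clause can be demonstrated safely with the sketch below. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption), a STUDENT table reduced to two columns, and a hypothetical GUID for roll number 1. The second UPDATE is rolled back, so the table is left as it was after the first one.

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE STUDENT(RollNumber INT PRIMARY KEY, GUID CHAR(12))')
con.executemany('INSERT INTO STUDENT VALUES (?, ?)', [
    (1, '444444444444'),   # hypothetical guardian for roll number 1
    (3, None),             # GUID not yet known
    (5, '101010101010'),
])

# With WHERE: only the record of roll number 3 is changed
cur = con.execute(
    "UPDATE STUDENT SET GUID = '101010101010' WHERE RollNumber = 3")
changed_with_where = cur.rowcount
con.commit()

# Without WHERE: every record is changed
cur = con.execute("UPDATE STUDENT SET GUID = '101010101010'")
changed_without_where = cur.rowcount
con.rollback()             # undo the accidental change

after = dict(con.execute('SELECT RollNumber, GUID FROM STUDENT'))

print(changed_with_where, changed_without_where)
print(after)
```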
Summary
• Database is a collection of related tables. MySQL is a
‘relational’ DBMS. A table is a collection of rows and
columns, where each row is a record and columns
describe the feature of records.
• SQL is the standard language for most RDBMS.
SQL is case insensitive.
• CREATE DATABASE statement is used to create a new
database.
• USE statement is used for making the specified
database as active database.
• CREATE TABLE statement is used to create a table.
• Every attribute in a CREATE TABLE statement must
have a name and a datatype.
• ALTER TABLE statement is used to make changes in
the structure of a table like adding, removing or
changing datatype of column(s).
• The DESC statement with table name shows the
structure of the table.
• INSERT INTO statement is used to insert record(s) in
a table.
• UPDATE statement is used to modify existing data in
a table.
• DELETE statement is used to delete records in a table.
Exercise
1. Match the following clauses with their respective
functions.
ALTER Insert the values in a table
UPDATE Restrictions on columns
DELETE Table definition
INSERT INTO Change the name of a column
CONSTRAINTS Update existing information in a table
DESC Delete an existing row from a table
CREATE Create a database
Table: PROJECT_ASSIGNED
RegistrationID ProjectID AssignDate
Table: PROJECT
ProjectID ProjectName SubmissionDate TeamSize GuideTeacher