DAF 1104 QM Teaching Notes Refined
Set theory is the mathematical theory of well-determined collections, called sets, of objects that
are called members, or elements, of the set. Pure set theory deals exclusively with sets, so the
only sets under consideration are those whose members are also sets.
It is natural for us to classify items into groups, or sets, and consider how those sets overlap with
each other. We can use these sets to understand relationships between groups, and to analyze
survey data.
Any collection of items can form a set.
Set
A set is a collection of distinct objects, called elements of the set.
A set can be defined by describing the contents, or by listing the elements of the set, enclosed in
curly brackets.
Example 1
Some examples of sets defined by describing the contents:
1. The set of all even numbers
2. The set of all books written about travel to Chile
Answers
Some examples of sets defined by listing the elements of the set:
1. {1, 3, 9, 12} by use of positive integers
2. {red, orange, yellow, green, blue, indigo, purple}
A set simply specifies the contents; order is not important. The set represented by {1, 2, 3} is
equivalent to the set {3, 1, 2}.
Types of Sets
The sets are further categorised into different types, based on elements or types of elements.
These different types of sets in basic set theory are:
i) Finite set: A finite set is a set that has a finite number of elements. In simple words, it is a
set that you can finish counting. For example, {1,3,5,7} is a finite set with four elements. The
number of elements in a finite set is a natural number, i.e. a non-negative integer.
ii) Infinite set: A set is said to be infinite if it has an unlimited number of elements, so that its
elements cannot all be listed. For example, the set of natural numbers {1, 2, 3, 4, ……} is infinite.
iii) Empty set: A set that does not contain any element is called an empty set or a null set. It is
written { } or denoted using the symbol '∅', which is read as 'phi'.
iv) Singleton set: A singleton set is a set containing exactly one element. For example, {a}, {∅},
and {{a}}.
v) Equal sets: Two sets are equal if they have exactly the same elements. Two sets A and B are
equal only on the condition that each element of set A is also an element of set B and each
element of set B is also an element of set A. In other words, if two sets are subsets of each
other, then they are equal sets, e.g. A = {1, 3, 9, 5, −7} and B = {5, −7, 3, 1, 9}.
vi) Equivalent sets: Two sets are equivalent if they contain the same number of elements, although
the elements themselves may be different. For example, set A = {5, 10, 15, 20} is equivalent to
set B = {w, x, y, z}, since each set contains four elements.
vii) Power set: The power set of a set S is the set of all subsets of S, including the empty set and
S itself. For example, the power set of S = {a, b, c} is:
P(S) = { {}, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c} }
viii) Universal set: Any set that contains all the sets under consideration. For example, if set A =
{2, 6, 9} and set B = {6, 7, 8}, then we can take the universal set to be the combination of the
elements of both sets, U = {2, 6, 7, 8, 9}.
ix) Subset: When all the elements of set A belong to set B, then A is a subset of B. A proper subset
of B is a subset that is not equal to B (it leaves out at least one element of B), whereas the
improper subset contains every element of B.
x) Superset: Set B is a superset of set A if all the elements of A are also elements of B. For
example, A = {1, 2, 3} is a subset of B = {1, 2, 3, 4, 10}, so B is a superset of A.
xi) The complement of set A is defined as a set that contains the elements present in the
universal set but not in set A. For example, Set U = {2,4,6,8,10,12} and set A = {4,6,8}, then the
complement of set A, A′ = {2,10,12}
Set Theory Symbols
There are several symbols that are adopted for common sets. They are given in the table below:
Symbol    Meaning
{ }       Set
A∪B       A union B
A∩B       A intersection B
A⊆B       A is a subset of B
A⊇B       A is a superset of B
Ø         Empty set
Ac        Complement of A
a∈B       a is an element of B
The complement of a set A contains everything that is not in the set A. The complement is
notated A’, or Ac, or sometimes ~A.
Example 5
Consider the sets:
A = {red, green, blue}
B = {red, yellow, orange}
C = {red, orange, yellow, green, blue, purple}
Find the following:
1. Find A ⋃ B
2. Find A ⋂ B
3. Find Ac⋂ C
Answers
1. The union contains all the elements in either set: A ⋃ B = {red, green, blue, yellow,
orange} Notice we only list red once.
2. The intersection contains all the elements in both sets: A ⋂ B = {red}
3. Here we’re looking for all the elements that are not in set A and are also in C. Ac ⋂ C =
{orange, yellow, purple}
Try It Now
Using the sets from the previous example, find A ⋃ C and Bc ⋂ A
Notice that in the example above, it would be hard to just ask for Ac, since everything from the
color fuchsia to puppies and peanut butter is included in the complement of the set. For this
reason, complements are usually only used with intersections, or when we have a universal set in
place.
Universal Set
A universal set is a set that contains all the elements we are interested in. This would have to be
defined by the context.
A complement is relative to the universal set, so Ac contains all the elements in the universal set
that are not in A.
Example 6
1. If we were discussing searching for books, the universal set might be all the books in the
library.
2. If we were grouping your Facebook friends, the universal set would be all your Facebook
friends.
3. If you were working with sets of numbers, the universal set might be all whole numbers,
all integers, or all real numbers
Example 7
Suppose the universal set is U = all whole numbers from 1 to 9. If A = {1, 2, 4}, then Ac = {3, 5,
6, 7, 8, 9}.
As we saw earlier with the expression Ac ⋂ C, set operations can be grouped together. Grouping
symbols can be used like they are with arithmetic – to force an order of operations.
Example 8
Suppose H = {cat, dog, rabbit, mouse}, F = {dog, cow, duck, pig, rabbit}, and W = {duck,
rabbit, deer, frog, mouse}
1. Find (H ⋂ F) ⋃ W
2. Find H ⋂ (F ⋃ W)
3. Find (H ⋂ F)c ⋂ W
Solutions
1. We start with the intersection: H ⋂ F = {dog, rabbit}. Now we union that result with W:
(H ⋂ F) ⋃ W = {dog, duck, rabbit, deer, frog, mouse}
2. We start with the union: F ⋃ W = {dog, cow, rabbit, duck, pig, deer, frog, mouse}. Now
we intersect that result with H: H ⋂ (F ⋃ W) = {dog, rabbit, mouse}
3. We start with the intersection: H ⋂ F = {dog, rabbit}. Now we want to find the elements
of W that are not in H ⋂ F. (H ⋂ F)c ⋂ W = {duck, deer, frog, mouse}
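The three parts of Example 8 can be checked with Python's built-in set type. This is only a sketch: the variable names part1, part2 and part3 are illustrative, and because no universal set is given, the complement in part 3 is computed as a set difference (the elements of W that are not in H ⋂ F), which is how the solution above reads it.

```python
# Sets from Example 8
H = {"cat", "dog", "rabbit", "mouse"}
F = {"dog", "cow", "duck", "pig", "rabbit"}
W = {"duck", "rabbit", "deer", "frog", "mouse"}

# & is intersection, | is union, - is set difference
part1 = (H & F) | W   # (H ⋂ F) ⋃ W
part2 = H & (F | W)   # H ⋂ (F ⋃ W)
part3 = W - (H & F)   # (H ⋂ F)c ⋂ W, taken as "in W but not in H ⋂ F"

print(sorted(part1))
print(sorted(part2))
print(sorted(part3))
```

Moving the brackets changes the result, which is why the grouping symbols matter.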
Set Theory Formulas
n(A ∪ B) = n(A) + n(B) – n(A ∩ B)
n(A∪B)=n(A)+n(B) {when A and B are disjoint sets}
Example a
1. Let A and B be two finite sets such that n(A) = 20, n(B) = 28 and n(A ∪ B) = 36, find n(A ∩
B).
Solution:
Using the formula n(A ∪ B) = n(A) + n(B) - n(A ∩ B).
then n(A ∩ B) = n(A) + n(B) - n(A ∪ B)
= 20 + 28 - 36
= 48 - 36
= 12
Example b
In a group of 100 persons, 72 people can speak English and 43 can speak French. How many can
speak English only? How many can speak French only and how many can speak both English
and French?
Solution:
Let A be the set of people who speak English.
B be the set of people who speak French.
A - B be the set of people who speak English and not French.
B - A be the set of people who speak French and not English.
A ∩ B be the set of people who speak both French and English.
Given,
n(A) = 72 n(B) = 43 n(A ∪ B) = 100
Now, n(A ∩ B) = n(A) + n(B) - n(A ∪ B)
= 72 + 43 - 100
= 115 - 100
= 15
Therefore, Number of persons who speak both French and English = 15
n(A) = n(A - B) + n(A ∩ B)
⇒ n(A - B) = n(A) - n(A ∩ B)
= 72 - 15
= 57
and n(B - A) = n(B) - n(A ∩ B)
= 43 - 15
= 28
Therefore, Number of people speaking English only = 57
Number of people speaking French only = 28
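The arithmetic in Examples a and b follows one pattern, so it can be wrapped in a small helper. The function name two_set_counts and its dictionary keys are made up for this sketch; it assumes, as Example b does, that n(A ∪ B) is known.

```python
def two_set_counts(n_a, n_b, n_union):
    """Apply n(A ∩ B) = n(A) + n(B) - n(A ∪ B) and split off the 'only' parts."""
    n_both = n_a + n_b - n_union
    return {
        "both": n_both,
        "a_only": n_a - n_both,  # n(A - B)
        "b_only": n_b - n_both,  # n(B - A)
    }

example_a = two_set_counts(20, 28, 36)
example_b = two_set_counts(72, 43, 100)  # A = English, B = French

print(example_a["both"])
print(example_b)
```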
Word problems on sets using the different properties (Union & Intersection):
Example c
Each student in a class of 40 plays at least one of the indoor games chess, carrom and scrabble. 18 play
chess, 20 play scrabble and 27 play carrom. 7 play chess and scrabble, 12 play scrabble and
carrom and 4 play chess, carrom and scrabble. Find the number of students who play
(i) chess and carrom.
(ii) chess, carrom but not scrabble.
Solution:
Let A be the set of students who play chess
B be the set of students who play scrabble
C be the set of students who play carrom
Therefore, We are given n(A ∪ B ∪ C) = 40,
n(A) = 18, n(B) = 20 n(C) = 27,
n(A ∩ B) = 7, n(C ∩ B) = 12 n(A ∩ B ∩ C) = 4
We have
n(A ∪ B ∪ C) = n(A) + n(B) + n(C) - n(A ∩ B) - n(B ∩ C) - n(C ∩ A) + n(A ∩ B ∩ C)
Therefore, 40 = 18 + 20 + 27 - 7 - 12 - n(C ∩ A) + 4
40 = 69 – 19 - n(C ∩ A)
40 = 50 - n(C ∩ A)
n(C ∩ A) = 50 - 40
n(C ∩ A) = 10
Therefore, the number of students who play chess and carrom is 10.
Also, number of students who play chess, carrom and not scrabble.
= n(C ∩ A) - n(A ∩ B ∩ C)
= 10 – 4
=6
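The same rearrangement of the three-set formula can be written out in code. This sketch simply solves the formula for the one unknown term, n(C ∩ A); the variable names are illustrative.

```python
# Given values from Example c (A = chess, B = scrabble, C = carrom)
n_union = 40
n_a, n_b, n_c = 18, 20, 27
n_ab, n_bc, n_abc = 7, 12, 4

# n(A ∪ B ∪ C) = n(A) + n(B) + n(C) - n(A ∩ B) - n(B ∩ C) - n(C ∩ A) + n(A ∩ B ∩ C)
# rearranged to isolate n(C ∩ A):
n_ca = n_a + n_b + n_c - n_ab - n_bc + n_abc - n_union

# Chess and carrom but not scrabble: remove those who play all three
chess_carrom_not_scrabble = n_ca - n_abc

print(n_ca, chess_carrom_not_scrabble)
```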
Example d
What is the sum of all the positive multiples of 7 which are less than 100?
Venn Diagrams
To visualize the interaction of sets, John Venn in 1880 thought to use overlapping circles,
building on a similar idea used by Leonhard Euler in the eighteenth century. These illustrations
are now called Venn diagrams.
Venn Diagram
A Venn diagram represents each set by a circle, usually drawn inside of a containing box
representing the universal set. Overlapping areas indicate elements common to both sets.
Basic Venn diagrams can illustrate the interaction of two or three sets.
Example 9
Create Venn diagrams to illustrate A ⋃ B, A ⋂ B, and Ac ⋂ B
A ⋃ B contains all elements in either set.
A ⋂ B contains only those elements in both sets—in the overlap of the circles.
Ac will contain all elements not in the set A. Ac ⋂ B will contain the elements in set B that are
not in set A.
Example 10
Use a Venn diagram to illustrate (H ⋂ F)c ⋂ W
We’ll start by identifying everything in the set H ⋂ F
Now, (H ⋂ F)c ⋂ W will contain everything not in the set identified above that is also in set W.
Example 11
Create an expression to represent the outlined part of the Venn diagram shown.
The elements in the outlined set are in sets H and F, but are not in set W. So we could represent
this set as H ⋂ F ⋂ Wc
Try It Now
Create an expression to represent the outlined portion of the Venn diagram shown
Cardinality
Often times we are interested in the number of items in a set or subset. This is called the
cardinality of the set.
Cardinality
The number of elements in a set is the cardinality of that set.
The cardinality of the set A is often notated as |A| or n(A)
Example 12
Let A = {1, 2, 3, 4, 5, 6} and B = {2, 4, 6, 8}.
What is the cardinality of B? Of A ⋃ B? Of A ⋂ B?
Answers
The cardinality of B is 4, since there are 4 elements in the set.
The cardinality of A ⋃ B is 7, since A ⋃ B = {1, 2, 3, 4, 5, 6, 8}, which contains 7 elements.
The cardinality of A ⋂ B is 3, since A ⋂ B = {2, 4, 6}, which contains 3 elements.
Example 13
What is the cardinality of P = the set of English names for the months of the year?
Answers
The cardinality of this set is 12, since there are 12 months in the year.
Sometimes we may be interested in the cardinality of the union or intersection of sets, but not
know the actual elements of each set. This is common in surveying.
Example 14
A survey asks 200 people “What beverage do you drink in the morning”, and offers choices:
Tea only
Coffee only
Both coffee and tea
Suppose 20 report tea only, 80 report coffee only, 40 report both. How many people drink tea in
the morning? How many people drink neither tea or coffee?
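Answers
The people who drink tea are those who reported tea only or both coffee and tea, so 20 + 40 = 60
people drink tea in the morning. The three choices account for 20 + 80 + 40 = 140 people, so
200 – 140 = 60 people drink neither tea nor coffee.
The same reasoning works when a survey reports percentages. Suppose F is the set of people who
have used Facebook and T is the set of people who have used a second service, where 70% of those
surveyed are in F, 40% are in T and 20% are in both. Notice that while the cardinality of F is 70%
and the cardinality of T is 40%, the cardinality of F ⋃ T is not simply 70% + 40%, since that would
count those who use both services twice. To find the cardinality of F ⋃ T, we can add the
cardinality of F and the cardinality of T, then subtract those in the intersection that we’ve
counted twice. In symbols,
n(F ⋃ T) = n(F) + n(T) – n(F ⋂ T)
n(F ⋃ T) = 70% + 40% – 20% = 90%
Now, to find how many people have not used either service, we’re looking for the cardinality of
(F ⋃ T)c. Since the universal set contains 100% of people and the cardinality of F ⋃ T = 90%,
the cardinality of (F ⋃ T)c must be the remaining 10%.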
Question: In a class of 100 students, 35 like science and 45 like math. 10 like both. How many
like either of them and how many like neither?
Solution:
Total number of students, n(µ) = 100
Number of science students, n(S) = 35
Number of math students, n(M) = 45
Number of students who like both, n(M∩S) = 10
Number of students who like either of them,
n(M∪S) = n(M) + n(S) – n(M∩S)
= 45 + 35 – 10 = 70
Number of students who like neither = n(µ) – n(M∪S) = 100 – 70 = 30
The easiest way to solve problems on sets is by drawing Venn diagrams, as shown below.
As it is said, one picture is worth a thousand words. One Venn diagram can help solve the
problem faster and save time. This is especially true when more than two categories are involved
in the problem.
Let us see some more solved examples.
Problem 1: There are 30 students in a class. Among them, 8 students are learning both English
and French. A total of 18 students are learning English. If every student is learning at least one
language, how many students are learning French in total?
Solution:
The Venn diagram for this problem looks like this.
Every student is learning at least one language. Hence there is no one who falls in the category
‘neither’.
So in this case, n(E∪F) = n(µ).
It is mentioned in the problem that a total of 18 are learning English. This DOES NOT mean that
18 are learning ONLY English. Only when the word ‘only’ is mentioned in the problem should
we consider it so.
Now, 18 are learning English and 8 are learning both. This means that 18 – 8 = 10 are learning
ONLY English.
n(µ) = 30, n(E) = 18, n(E∩F) = 8
n(E∪F) = n(E) + n(F) – n(E∩F)
30 = 18+ n(F) – 8
n(F) = 20
Therefore, total number of students learning French = 20.
Note: The question was only about the total number of students learning French and not about
those learning ONLY French, which would have been a different answer, 12.
Finally, the Venn diagram looks like this.
Problem 2: Among a group of students, 50 played cricket, 50 played hockey and 40 played
volley ball. 15 played both cricket and hockey, 20 played both hockey and volley ball, 15 played
cricket and volley ball and 10 played all three. If every student played at least one game, find the
number of students and how many played only cricket, only hockey and only volley ball?
Solution:
n(C) = 50, n(H) = 50, n(V) = 40
n(C∩H) = 15
n(H∩V) = 20
n(C∩V) = 15
n(C∩H∩V) = 10
No. of students who played at least one game
n(C∪H∪V) = n(C) + n(H) + n(V) – n(C∩H) – n(H∩V) – n(C∩V) + n(C∩H∩V)
= 50 + 50 + 40 – 15 – 20 – 15 + 10
= 100
Total number of students = 100.
Let a denote the number of people who played cricket and volleyball only.
Let b denote the number of people who played cricket and hockey only.
Let c denote the number of people who played hockey and volleyball only.
Let d denote the number of people who played all three games.
Accordingly, d = n(C∩H∩V) = 10
Now, n(C∩V) = a + d = 15
n(C∩H) = b + d = 15
n(H∩V) = c + d = 20
Therefore, a = 15 – 10 = 5 [cricket and volleyball only]
b = 15 – 10 = 5 [cricket and hockey only]
c = 20 – 10 = 10 [hockey and volleyball only]
No. of students who played only cricket = n(C) – [a + b + d] = 50 – (5 + 5 + 10) = 30
No. of students who played only hockey = n(H) – [b + c + d] = 50 – ( 5 + 10 + 10) = 25
No. of students who played only volley ball = n(V) – [a + c + d] = 40 – (5 + 10 + 10) = 15
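The steps of Problem 2 can be collected into one function that returns the total and the three "only" regions of the Venn diagram. The function name three_set_regions and its result keys are assumptions made for this sketch.

```python
def three_set_regions(n_c, n_h, n_v, n_ch, n_hv, n_cv, n_chv):
    """Return the union size and the 'only' regions of three overlapping sets."""
    total = n_c + n_h + n_v - n_ch - n_hv - n_cv + n_chv
    cv_only = n_cv - n_chv  # a: cricket and volleyball only
    ch_only = n_ch - n_chv  # b: cricket and hockey only
    hv_only = n_hv - n_chv  # c: hockey and volleyball only
    return {
        "total": total,
        "only_c": n_c - (cv_only + ch_only + n_chv),
        "only_h": n_h - (ch_only + hv_only + n_chv),
        "only_v": n_v - (cv_only + hv_only + n_chv),
    }

regions = three_set_regions(50, 50, 40, 15, 20, 15, 10)
print(regions)
```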
As the Oxford dictionary states it, Probability means ‘The extent to which something is probable;
the likelihood of something happening or being the case’.
In mathematics too, probability indicates the same – the likelihood of the occurrence of an event.
Terms in Probability
The following terms in probability help in a better understanding of the concepts of probability.
PROBABILITY RULES
1.) The Addition Rule: P(A or B) = P(A) + P(B) - P(A and B)
If A and B are mutually exclusive events, or those that cannot occur together, then the third
term is 0, and the rule reduces to P(A or B) = P(A) + P(B). For example, you can't flip a coin and
have it come up both heads and tails on one toss.
2.) The Multiplication Rule: P(A and B) = P(A) * P(B|A) or P(B) * P(A|B)
If A and B are independent events, we can reduce the formula to P(A and B) = P(A) * P(B).
The term independent refers to any event whose outcome is not affected by the outcome of
another event. For instance, consider the second of two coin flips, which still has a .50 (50%)
probability of landing heads, regardless of what came up on the first flip. What is the probability
that, during the two coin flips, you come up with tails on the first flip and heads on the second
flip?
3.) The Complement Rule: P(not A) = 1 - P(A)
Do you see why the complement rule can also be thought of as the subtraction rule? This rule
builds upon the mutually exclusive nature of P(A) and P(not A). These two events can never
occur together, but one of them always has to occur. Therefore P(A) + P(not A) = 1. For
example, if the weatherman says there is a 0.3 chance of rain tomorrow, what are the chances of
no rain?
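The questions posed in this section can be worked with exact fractions. In this sketch the helper names p_or, p_and_independent and p_not are made up for illustration, and the addition rule is taken in its general form P(A or B) = P(A) + P(B) - P(A and B), whose third term drops out for mutually exclusive events.

```python
from fractions import Fraction

def p_or(p_a, p_b, p_a_and_b=Fraction(0)):
    """Addition rule; the third term is 0 for mutually exclusive events."""
    return p_a + p_b - p_a_and_b

def p_and_independent(p_a, p_b):
    """Multiplication rule for independent events."""
    return p_a * p_b

def p_not(p_a):
    """Complement (subtraction) rule."""
    return 1 - p_a

half = Fraction(1, 2)
heads_or_tails = p_or(half, half)                 # one toss, mutually exclusive
tails_then_heads = p_and_independent(half, half)  # two independent flips
no_rain = p_not(Fraction(3, 10))                  # forecast of 0.3 chance of rain

print(heads_or_tails, tails_then_heads, no_rain)
```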
Compound probability
Compound probability is when the problem statement asks for the likelihood of the occurrence of
more than one outcome.
P(A and B) is the probability of the occurrence of both A and B at the same time.
Mutually exclusive events are those where the occurrence of one indicates the non-occurrence of
the other
OR
When two events cannot occur at the same time, they are considered mutually exclusive.
Solution:
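Taking the individual probabilities of each number on a single roll of a die, getting a 2 is 1/6
and so is getting a 5. Because the two outcomes are mutually exclusive, the probability of getting
a 2 or a 5 is 1/6 + 1/6 = 1/3.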
Example 2: Consider the example of finding the probability of selecting a black card or a 6 from
a deck of 52 cards.
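Solution:
A deck has 26 black cards and four 6s, and two of the 6s are black, so they must not be counted
twice.
P(black or 6) = P(black) + P(6) - P(black and 6)
= 26/52 + 4/52 - 2/52
= 28/52
= 7/13.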
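The count behind 28/52 can be checked by listing a standard 52-card deck and counting the cards that are black or a 6. The rank and suit labels below are just one convenient encoding, assumed for this sketch.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "clubs", "hearts", "diamonds"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

black = {card for card in deck if card[1] in ("spades", "clubs")}
sixes = {card for card in deck if card[0] == "6"}

# The union counts the two black 6s once, as the addition rule requires
p_black_or_six = Fraction(len(black | sixes), len(deck))
print(len(black | sixes), p_black_or_six)
```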
Independent Event
When multiple events occur, if the outcome of one event DOES NOT affect the outcome of the
other events, they are called independent events.
Say, a die is rolled twice. The outcome of the first roll doesn’t affect the second outcome. These
two are independent events.
Example 1: Say, a coin is tossed twice. What is the probability of getting two consecutive tails ?
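Probability of getting a tail in one toss = 1/2
The two tosses are independent, so the probability of getting two consecutive tails = 1/2 * 1/2 = 1/4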
Here’s the verification of the above answer with the help of sample space.
When a coin is tossed twice, the sample space is {(H,H), (H,T), (T,H), (T,T)}.
Our desired event is (T,T) whose occurrence is only once out of four possible outcomes and
hence, our answer is 1/4.
Example 2: Consider another example where a pack contains 4 blue, 2 red and 3 black pens. If a
pen is drawn at random from the pack, replaced and the process repeated 2 more times, What is
the probability of drawing 2 blue pens and 1 black pen?
Solution
Dependent Events
When two events occur, if the outcome of one event affects the outcome of the other, they are
called dependent events.
Consider the aforementioned example of drawing a pen from a pack, with a slight difference.
Example 1: A pack contains 4 blue, 2 red and 3 black pens. If 2 pens are drawn at random from
the pack, NOT replaced and then another pen is drawn. What is the probability of drawing 2 blue
pens and 1 black pen?
Solution:
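Example 2: What is the probability of drawing a king and a queen consecutively from a deck of
52 cards, without replacement?
Probability of drawing a king first = 4/52 = 1/13
After the king is drawn, 51 cards remain and 4 of them are queens, so the probability of drawing a
queen next = 4/51
Now, the probability of drawing a king and queen consecutively is 1/13 * 4/51 = 4/663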
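The king-then-queen result can be confirmed two ways: by multiplying the two probabilities, and by counting ordered pairs of distinct cards. This is a sketch that assumes the king must come first and the queen second, which is how the product 1/13 * 4/51 reads; the deck encoding is illustrative.

```python
from fractions import Fraction
from itertools import permutations, product

# Multiplication rule for dependent events
p_king_first = Fraction(4, 52)
p_queen_second = Fraction(4, 51)  # one card is already gone
p_by_rule = p_king_first * p_queen_second

# Brute-force check: every ordered draw of two different cards
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = list(product(ranks, range(4)))  # suit encoded as 0..3
draws = list(permutations(deck, 2))
favourable = [d for d in draws if d[0][0] == "K" and d[1][0] == "Q"]
p_by_count = Fraction(len(favourable), len(draws))

print(p_by_rule, p_by_count)
```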
Conditional probability
Conditional probability is calculating the probability of an event given that another event has
already occurred.
Example: In a class, 40% of the students study math and science. 60% of the students study
math. What is the probability of a student studying science given he/she is already studying
math?
Solution
P(M and S) = 0.40
P(M) = 0.60
P(S|M) = P(M and S) / P(M) = 0.40 / 0.60 = 2/3 ≈ 0.67
Complement of an event
A complement of an event A can be stated as that which does NOT contain the occurrence of A.
A complement of an event is denoted as P(Ac) or P(A’).
P(Ac) = 1 – P(A)
or it can be stated, P(A)+P(Ac) = 1
For example,
if A is the event of getting a head in coin toss, Ac is not getting a head i.e., getting a tail.
if A is the event of getting an even number in a die roll, Ac is the event of NOT getting an even
number i.e., getting an odd number.
Example: A single coin is tossed 5 times. What is the probability of getting at least one head?
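Solution:
Let A be the event of getting at least one head. Then Ac is the event of getting no heads, i.e.
getting five tails in a row.
P(Ac) = (1/2)^5 = 1/32
P(A) = 1 – P(Ac) = 1 – 1/32 = 31/32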
Probability Example 1
What is the probability of the occurrence of a number that is odd or less than 5 when a fair die is
rolled.
Solution
Let the event of the occurrence of a number that is odd be ‘A’ and the event of the occurrence of
a number that is less than 5 be ‘B’. We need to find P(A or B).
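P(A) = 3/6 (odd numbers = 1, 3 and 5)
P(B) = 4/6 (numbers less than 5 = 1, 2, 3 and 4)
P(A and B) = 2/6 (numbers that are both odd and less than 5 = 1 and 3)
P(A or B) = P(A) + P(B) – P(A and B) = 3/6 + 4/6 – 2/6 = 5/6.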
Probability Example 2
A box contains 4 chocobars and 4 ice creams. Tom eats 3 of them one after another. What is the
probability of sequentially choosing 2 chocobars and 1 icecream?
Solution
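Probability of choosing a chocobar first = 4/8 = 1/2
Probability of choosing a second chocobar = 3/7
Probability of then choosing an icecream = 4/6 = 2/3
So the final probability of choosing 2 chocobars and 1 icecream = 1/2 * 3/7 * 2/3 = 1/7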
Probability Example 3
When two dice are rolled, find the probability of getting a greater number on the first die than the
one on the second, given that the sum should equal 8.
Solution
There are 5 ways to get a sum of 8 when two dice are rolled = {(2,6),(3,5),(4,4), (5,3),(6,2)}.
And there are two ways where the number on the first die is greater than the one on the second
given that the sum should equal 8, G = {(5,3), (6,2)}.
Therefore, P(Sum equals 8) = 5/36 and P(G) = 2/36.
P(G | Sum equals 8) = P(G and Sum equals 8) / P(Sum equals 8)
= (2/36)/(5/36)
= 2/5
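Because the sample space for two dice is small, the conditional probability can be found by listing all 36 rolls, keeping those with a sum of 8, and counting how many of those have a larger first die. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))   # all 36 (first, second) outcomes
sum_is_8 = [r for r in rolls if sum(r) == 8]   # the conditioning event
first_greater = [r for r in sum_is_8 if r[0] > r[1]]

# Conditional probability: restrict attention to the rolls that sum to 8
p_conditional = Fraction(len(first_greater), len(sum_is_8))
print(sum_is_8, first_greater, p_conditional)
```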
b) i) To find the probability of getting two black balls, first locate the B branch and then follow
the second B branch. Since these are independent events we can multiply the probability of each
branch.
ii) There are two outcomes where the second ball can be black, either (B, B) or (W, B)
Example:
Bag A contains 10 marbles of which 2 are red and 8 are black. Bag B contains 12 marbles of
which 4 are red and 8 are black. A marble is drawn at random from each bag.
a) Draw a probability tree diagram to show all the outcomes of the experiment.
b) Find the probability that:
(i) both are red.
(ii) both are black.
(iii) one black and one red.
(iv) at least one red.
Solution:
a) A probability tree diagram that shows all the outcomes of the experiment.
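b) (i) both are red.
P(R, R) = 2/10 * 4/12 = 8/120 = 1/15
(ii) both are black.
P(B, B) = 8/10 * 8/12 = 64/120 = 8/15
(iii) one black and one red.
P(R, B) or P(B, R) = 2/10 * 8/12 + 8/10 * 4/12 = 48/120 = 2/5
(iv) at least one red.
1 - P(B, B) = 1 - 8/15 = 7/15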
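A probability tree for the two bags can be represented as a dictionary that maps each path (colour from bag A, colour from bag B) to the product of its branch probabilities. This is a sketch; the names bag_a, bag_b and tree are assumptions, and the draws are treated as independent because they come from different bags.

```python
from fractions import Fraction
from itertools import product

bag_a = {"R": Fraction(2, 10), "B": Fraction(8, 10)}
bag_b = {"R": Fraction(4, 12), "B": Fraction(8, 12)}

# Each path through the tree multiplies the probabilities along its branches
tree = {
    (a, b): p_a * p_b
    for (a, p_a), (b, p_b) in product(bag_a.items(), bag_b.items())
}

both_red = tree[("R", "R")]
both_black = tree[("B", "B")]
one_of_each = tree[("R", "B")] + tree[("B", "R")]
at_least_one_red = 1 - both_black

print(both_red, both_black, one_of_each, at_least_one_red)
```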
Example:
A box contains 4 red and 2 blue chips. A chip is drawn at random and then replaced. A second
chip is then drawn at random.
a) Show all the possible outcomes using a probability tree diagram.
b) Calculate the probability of getting:
(i) at least one blue.
(ii) one red and one blue.
(iii) two of the same color.
Solution:
a) A probability tree diagram to show all the possible outcomes.
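b) (i) at least one blue.
1 - P(R, R) = 1 - 4/6 * 4/6 = 1 - 16/36 = 20/36 = 5/9
(ii) one red and one blue.
P(R, B) or P(B, R) = 4/6 * 2/6 + 2/6 * 4/6 = 16/36 = 4/9
(iii) two of the same color.
P(R, R) or P(B, B) = 4/6 * 4/6 + 2/6 * 2/6 = 20/36 = 5/9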
Probability Tree Diagrams For Independent Events
How To Solve Probability Problems Using Probability Tree Diagrams?
Example:
A coin is biased so that it has a 60% chance of landing on heads. If it is thrown three times, find
the probability of getting
a) three heads
b) 2 heads and a tail
c) at least one head
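One way to work this example is to list the eight paths of the three-throw tree and multiply the branch probabilities along each path. The sketch below assumes that "2 heads and a tail" means exactly two heads in any order; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

p = {"H": Fraction(3, 5), "T": Fraction(2, 5)}  # 60% heads, 40% tails

# Probability of each of the 8 paths (HHH, HHT, ..., TTT)
paths = {}
for outcome in product("HT", repeat=3):
    prob = Fraction(1)
    for face in outcome:
        prob *= p[face]  # throws are independent
    paths[outcome] = prob

three_heads = paths[("H", "H", "H")]
two_heads_one_tail = sum(pr for o, pr in paths.items() if o.count("H") == 2)
at_least_one_head = 1 - paths[("T", "T", "T")]

print(three_heads, two_heads_one_tail, at_least_one_head)
```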
How To Use A Tree Diagram To Calculate Combined Probabilities Of Two Independent
Events?
Example:
Jenny has a bag with seven blue sweets and 3 red sweets in it. She picks up a sweet at random
from the bag, replaces it and then picks again at random. Draw a tree diagram to represent this
situation and use it to calculate the probabilities that she picks:
(a) two red sweets
(b) no red sweets
(c) at least one blue sweet
(d) one sweet of each color
Probability Tree Diagrams For Dependent Events
How To Use A Probability Tree Diagram To Calculate Probabilities Of Two Events Which
Are Not Independent?
Example:
Jimmy has a bag with seven blue sweets and 3 red sweets in it. He picks up a sweet at random
from the bag, but does not replace it and then picks again at random. Draw a tree diagram to
represent this situation and use it to calculate the probabilities that he picks:
(a) two red sweets
(b) no red sweets
(c) at least one blue sweet
(d) one sweet of each color
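The only difference between Jenny's and Jimmy's trees is whether the bag changes after the first pick. The sketch below builds both trees with one function; two_pick_tree and answers are hypothetical helper names, and the bag is the one described above (7 blue, 3 red).

```python
from fractions import Fraction

def two_pick_tree(blue, red, replace):
    """Map each path (first, second) to its probability for two picks."""
    counts = {"B": blue, "R": red}
    total = blue + red
    tree = {}
    for first in "BR":
        p_first = Fraction(counts[first], total)
        remaining = dict(counts)
        remaining_total = total
        if not replace:
            remaining[first] -= 1  # the picked sweet is not put back
            remaining_total -= 1
        for second in "BR":
            tree[(first, second)] = p_first * Fraction(remaining[second], remaining_total)
    return tree

def answers(tree):
    return {
        "two_red": tree[("R", "R")],
        "no_red": tree[("B", "B")],
        "at_least_one_blue": 1 - tree[("R", "R")],
        "one_of_each": tree[("B", "R")] + tree[("R", "B")],
    }

jenny = answers(two_pick_tree(7, 3, replace=True))   # independent picks
jimmy = answers(two_pick_tree(7, 3, replace=False))  # dependent picks
print(jenny)
print(jimmy)
```

Without replacement, the second-branch probabilities are taken out of 9 sweets instead of 10, which is what makes the events dependent.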
Measures of Central Tendency
Central tendency is defined as “the statistical measure that identifies a single value as representative of
an entire distribution.” It aims to provide an accurate description of the entire data. It is the single value
that is most typical/representative of the collected data. The term “number crunching” is used to
illustrate this aspect of data description. The mean, median and mode are the three commonly used
measures of central tendency.
Arithmetic mean
Arithmetic mean (or, simply, “mean”) is nothing but the average. It is computed by
adding all the values in the data set and dividing by the number of observations in it. If we
have the raw data, mean is given by the formula
Mean = ∑X / n
Where ∑ (the uppercase Greek letter sigma) refers to summation, X refers to the
individual value and n is the number of observations in the sample (sample size). The
research articles published in journals do not provide raw data and, in such a situation,
the readers can compute the mean by calculating it from the frequency distribution (if
provided).
Mean = ∑fX / n
Where f is the frequency, X is the midpoint of the class interval and n is the number
of observations. The standard statistical notations (in relation to measures of central
tendency) are mentioned in [Table 1]. Readers are cautioned that the mean calculated
from the frequency distribution is not exactly the same as that calculated from the raw
data. It approaches the mean calculated from the raw data as the number of intervals
increases.
Table 1
ADVANTAGES
The mean uses every value in the data and hence is a good representative of the data. The
irony in this is that most of the time this value never appears in the raw data.
Repeated samples drawn from the same population tend to have similar means. The mean
is therefore the measure of central tendency that best resists the fluctuation between
different samples
DISADVANTAGES
The important disadvantage of mean is that it is sensitive to extreme values/outliers,
especially when the sample size is small. Therefore, it is not an appropriate measure of
central tendency for skewed distribution.
Mean cannot be calculated for nominal or non-numerical ordinal data. Even though mean
can be calculated for numerical ordinal data, many times it does not give a meaningful
value, e.g. stage of cancer.
Weighted mean
Weighted mean is calculated when certain values in a data set are more important than
the others. A weight wi is attached to each of the values xi to reflect this importance.
For example, When weighted mean is used to represent the average duration of stay by a
patient in a hospital, the total number of cases presenting to each ward is taken as the
weight.
Geometric Mean
It is defined as the antilog of the arithmetic mean of the values taken on a log scale. It is
also expressed as the nth root of the product of the n observations.
Harmonic mean
HM is appropriate in situations where the reciprocals of values are more useful. HM is
used when we want to determine the average sample size of a number of groups, each of
which has a different sample size.
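The three means described above can be compared on small inputs. All the numbers in this sketch (ward stays, case counts, group sizes) are hypothetical and chosen only to make the arithmetic easy to follow; weighted_mean is an illustrative helper, while geometric_mean and harmonic_mean come from Python's statistics module.

```python
import math
from statistics import geometric_mean, harmonic_mean

def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights w_i."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical average stay (days) in three wards, weighted by number of cases
stay_days = [4.0, 6.0, 10.0]
cases = [50, 30, 20]
wm = weighted_mean(stay_days, cases)

# Geometric mean: nth root of the product, or the antilog of the mean log
values = [2.0, 8.0]
gm = geometric_mean(values)
gm_via_logs = math.exp(sum(math.log(v) for v in values) / len(values))

# Harmonic mean of hypothetical group sizes: n divided by the sum of reciprocals
group_sizes = [10, 20, 40]
hm = harmonic_mean(group_sizes)

print(wm, gm, gm_via_logs, hm)
```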
Mean (Arithmetic)
The mean (or average) is the most popular and well known measure of central tendency. It can
be used with both discrete and continuous data, although its use is most often with continuous
data. The mean is equal to the sum of all the
values in the data set divided by the number of values in the data set. So, if we have n
values in a data set and they have values x1, x2, …, xn, the sample mean, usually denoted by
x̄, is:
x̄ = (x1 + x2 + … + xn) / n
This formula is usually written in a slightly different manner using the Greek capital letter
∑, which means "sum of":
x̄ = ∑x / n
You may have noticed that the above formula refers to the sample mean. So, why have we called
it a sample mean? This is because, in statistics, samples and populations have very different
meanings and these differences are very important, even if, in the case of the mean, they are
calculated in the same way. To acknowledge that we are calculating the population mean and not
the sample mean, we use the Greek lower case letter "mu", denoted as µ:
µ = ∑x / n
The mean is essentially a model of your data set. It is a single value that summarises the data set as a whole. You will
notice, however, that the mean is not often one of the actual values that you have observed in
your data set. However, one of its important properties is that it minimises error in the prediction
of any one value in your data set. That is, it is the value that produces the lowest amount of error
from all other values in the data set.
An important property of the mean is that it includes every value in your data set as part of the
calculation. In addition, the mean is the only measure of central tendency where the sum of the
deviations of each value from the mean is always zero.
Mean of grouped data:
While calculating the mean of the grouped data, the values x1, x2, x3, ……. xn are taken as the
mid-values or the class marks of various class intervals. If the frequency distribution is inclusive,
then it should be first converted to exclusive distribution.
Solution:
We have
∑fi = 1 + 2 + 2 + 4 + 6 + 2 + 3 = 20
∑fi xi = 1 + 6 + 10 + 28 + 54 + 22 + 39 = 160
Mean = ∑fi xi / ∑fi = 160 / 20 = 8
Median
The median is the middle score for a set of data that has been arranged in order of magnitude.
The median is less affected by outliers and skewed data. In order to calculate the median,
suppose we have the data below:
65 55 89 56 35 14 56 55 87 45 92
We first need to rearrange that data into order of magnitude (smallest first):
14 35 45 55 55 56 56 65 87 89 92
Our median mark is the middle mark - in this case, 56. It is the middle mark
because there are 5 scores before it and 5 scores after it. This works fine when you have an odd
number of scores, but what happens when you have an even number of scores? What if you had
only 10 scores? Well, you simply have to take the middle two scores and average the result. So,
if we look at the example below:
65 55 89 56 35 14 56 55 87 45
We again rearrange that data into order of magnitude (smallest first):
14 35 45 55 55 56 56 65 87 89
Only now we have to take the 5th and 6th score in our data set and average them to get a median
of 55.5.
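The odd and even cases can be written as one short function and compared with the median function in Python's statistics module. The function name middle_value is made up for this sketch; the two lists are the score sets used above.

```python
from statistics import median

def middle_value(scores):
    """Median: sort, then take the middle score or average the middle two."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: single middle score
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average of two

odd_scores = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]
even_scores = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45]

print(middle_value(odd_scores), median(odd_scores))
print(middle_value(even_scores), median(even_scores))
```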
Mode
The mode is the most frequent score in our data set. On a histogram or bar chart it is represented
by the highest bar. You can, therefore, sometimes consider the mode as being the most
popular option. An example of a mode is presented below:
Mean of Grouped Data
Grouped data is the data set formed by aggregating individual observations of a variable
into different groups or categories. Mean
is considered as the average of the data. For the mean of grouped data, it might be difficult to
find the exact value however, we can always estimate it. Let us learn more about the mean of
grouped data, the methods to find the mean of grouped data, and solve a few examples to
understand this concept better.
Definition of Mean
The mean is the average or a calculated central value of a set of numbers that is used to measure
the central tendency of the data. Central tendency is the statistical measure that recognizes the
entire set of data or distribution through a single value. In statistics, the mean can also be defined
as the ratio of the sum of all observations to the total number of observations. Given a data set
X = x1, x2, ..., xn, the mean (or arithmetic mean, or average), denoted x̄, is the sum of the n
values x1, x2, ..., xn divided by n.
The mean formula is defined as the sum of the observations divided by the total number of
observations. There are two different formulas for calculating the mean for ungrouped data and the
mean for grouped data. Let us look at the formula to calculate the mean of grouped data. The formula
is: x̄ = Σfixi/N
Where,
fi = frequency of the ith class
xi = midpoint (class mark) of the ith class
N = sum of frequencies
Data
To calculate the mean of grouped data we have three different methods - direct method, assumed
mean method, and step deviation method. The mean of grouped data deals with the frequencies
of different observations or variables that are grouped together. Let us look at each of these
methods separately.
Direct Method
The direct method is the simplest method to find the mean of grouped data. If the values of
the observations are x1, x2, x3, ..., xn with corresponding frequencies f1, f2, f3, ..., fn, then
the mean is given by:
x̄ = (x1f1 + x2f2 + ... + xnfn) / (f1 + f2 + ... + fn)
x̄ = ∑xifi / ∑fi, where i = 1, 2, 3, ..., n
Here are the steps that can be followed to find the mean for grouped data using the direct
method:
1. Create a table containing four columns: class interval, class mark (midpoint) xi,
frequency fi, and the product xifi.
2. Calculate each midpoint using the formula xi = (upper class limit + lower class limit)/2.
3. Calculate the mean using the formula Mean = ∑xifi / ∑fi, where fi is the frequency and xi
is the midpoint of the class interval.
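Class Interval    0 - 10    10 - 20    20 - 30    30 - 40    40 - 50
Frequency (fi)    9         13         8          15         10
Solution: The first step is to create the table with the midpoints (class marks) and the product
of the frequency and midpoint. To calculate each midpoint we take the average of the class
limits, using the formula mentioned above.
Class Interval    Midpoint xi    Frequency fi    xifi
0 - 10            5              9               45
10 - 20           15             13              195
20 - 30           25             8               200
30 - 40           35             15              525
40 - 50           45             10              450
Totals:                          55              1415
Mean = ∑xifi / ∑fi = 1415 / 55 ≈ 25.73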
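The direct method can be sketched in a few lines of Python, using the class intervals and frequencies from the table above. The variable names are illustrative, not part of the notes.

```python
# Class intervals (lower limit, upper limit) and frequencies from the example above.
intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [9, 13, 8, 15, 10]

# Step 1: midpoint of each class, xi = (upper class limit + lower class limit) / 2.
midpoints = [(lower + upper) / 2 for lower, upper in intervals]

# Step 2: product of frequency and midpoint for each class.
products = [f * x for f, x in zip(frequencies, midpoints)]

# Step 3: mean = sum of (fi * xi) divided by sum of fi.
mean = sum(products) / sum(frequencies)

print(midpoints)
print(products)
print(round(mean, 3))
```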
Question 1
The following data represent the annual rainfall distribution in St. Louis, Missouri, for a sample
of 25 years from 1870 to 2004.
Rainfall (inches) Number of Years
20 - 24 1
25 - 29 3
30 - 34 5
35 - 39 8
40 - 44 5
45 - 49 2
50 - 54 0
55 - 59 1
Required
The following data show the various runners and the time they take to complete a race.
Seconds Frequency
51 – 55 2
56 – 60 7
61 – 65 8
66 – 70 4
The groups (51-55, 56-60, etc), also called class intervals, are of width 5
The midpoints are in the middle of each class: 53, 58, 63 and 68
Midpoint x    Frequency f    Midpoint × Frequency fx
53            2              106
58            7              406
63            8              504
68            4              272
Totals:       21             1288
And then our estimate of the mean time to complete the race is:
Estimated Mean = 1288/21 = 61.333
Estimating the Median from Grouped Data
Seconds Frequency
51 - 55 2
56 - 60 7
61 - 65 8
66 - 70 4
The position of the median is given by the formula (N + 1)/2, where N is the total frequency.
Here N = 21, so the median is the (21 + 1)/2 = 11th value, which is in the 61 - 65 group.
We can say "the median group is 61 - 65".
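Once the median group is known, a value inside it can be estimated by linear interpolation. The sketch below uses the widely taught formula Estimated Median = L + ((n/2 − B) / G) × w, where L is the lower class boundary of the median group, B is the cumulative frequency before it, G is its frequency and w is the group width. Two things here are assumptions rather than part of the notes: the formula itself, and taking the class boundaries as the lower limits minus 0.5.

```python
# Runner data from the table above.
lower_limits = [51, 56, 61, 66]
frequencies = [2, 7, 8, 4]
width = 5

# Assumption: class boundaries sit half a unit below the stated lower limits.
lower_boundaries = [limit - 0.5 for limit in lower_limits]

n = sum(frequencies)
cumulative_before = 0
estimated_median = None

for boundary, frequency in zip(lower_boundaries, frequencies):
    # The median group is the first group whose cumulative frequency reaches n/2.
    if cumulative_before + frequency >= n / 2:
        estimated_median = boundary + (n / 2 - cumulative_before) / frequency * width
        break
    cumulative_before += frequency

print(estimated_median)
```

For this table the loop stops at the 61 - 65 group, which agrees with the median group identified above.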
Estimating the Mode from Grouped Data
Seconds Frequency
51 - 55 2
56 - 60 7
61 - 65 8
66 - 70 4
We can easily find the modal group (the group with the highest frequency), which is 61 - 65
We can say "the modal group is 61 - 65"
But the actual Mode may not even be in that group! Or there may be more than one mode.
Without the raw data we don't really know.
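But we can estimate the Mode using the following formula:
Estimated Mode = L + [(fm − fm-1) / ((fm − fm-1) + (fm − fm+1))] × w
where:
L is the lower class boundary of the modal group
fm-1 is the frequency of the group before the modal group
fm is the frequency of the modal group
fm+1 is the frequency of the group after the modal group
w is the group width
In this example:
L = 60.5
fm-1 = 7
fm = 8
fm+1 = 4
w = 5
Estimated Mode = 60.5 + [(8 − 7) / ((8 − 7) + (8 − 4))] × 5 = 60.5 + 1 = 61.5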
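The mode estimate can be reproduced with a short Python sketch that locates the modal group and applies the formula to the values listed above. The variable names are illustrative, and treating a missing neighbouring group as frequency 0 is an assumption that this example does not need.

```python
# Runner data from the table above.
lower_limits = [51, 56, 61, 66]
frequencies = [2, 7, 8, 4]
width = 5

# The modal group is the group with the highest frequency.
m = frequencies.index(max(frequencies))

L = lower_limits[m] - 0.5  # lower class boundary of the modal group (60.5)
f_m = frequencies[m]       # frequency of the modal group
# Assumption: a missing neighbour (first or last group) counts as frequency 0.
f_prev = frequencies[m - 1] if m > 0 else 0
f_next = frequencies[m + 1] if m + 1 < len(frequencies) else 0

estimated_mode = L + (f_m - f_prev) / ((f_m - f_prev) + (f_m - f_next)) * width
print(estimated_mode)
```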
Now, what is linear programming? Linear programming is a simple technique where we depict
complex relationships through linear functions and then find the optimum points. The important
word in the previous sentence is "depict". The real relationships might be much more complex,
but we can simplify them to linear relationships.
Applications of linear programming are everywhere around you. You use linear programming on
both personal and professional fronts. You are using linear programming when you are driving from
home to work and want to take the shortest route, or when you have a project delivery and devise
strategies to make your team work efficiently for on-time delivery.
Let’s say a FedEx delivery man has 6 packages to deliver in a day. The warehouse is located at
point A. The 6 delivery destinations are given by U, V, W, X, Y, and Z. The numbers on the
lines indicate the distance between the cities. To save on fuel and time the delivery person wants
to take the shortest route.
So, the delivery person will calculate different routes for going to all the 6 destinations and then
come up with the shortest route. This technique of choosing the shortest route is called linear
programming.
Operations Research
The objective of the delivery person is to deliver the parcels on time at all 6 destinations. The
process of choosing the best route is called Operations Research. Operations research is an
approach to decision-making which involves a set of methods to operate a system. In the above
example, the system was the delivery model.
Linear programming is used for obtaining the optimal solution for a problem with given
constraints. In linear programming, we formulate our real-life problem into a mathematical
model. It involves an objective function and linear inequalities that express the constraints.
Is the linear representation of the 6 points above representative of the real-world? Yes and No. It
is an oversimplification as the real route would not be a straight line. It would likely have
multiple turns, U-turns, signals and traffic jams. But with a simple assumption, we have reduced
the complexity of the problem drastically and are creating a solution that should work in most
scenarios.
Let us define some terminologies used in Linear Programming, using the chocolate example given below.
Decision Variables: The decision variables are the variables that will decide my output.
They represent my ultimate solution. To solve any problem, we first need to identify the
decision variables. For the below example, the total number of units for A and B denoted
by X & Y respectively are my decision variables.
Constraints: The constraints are the restrictions or limitations on the decision variables.
They usually limit the value of the decision variables. In the below example, the limit on
the availability of resources Milk and Choco are my constraints.
Non-negativity restriction: For all linear programs, the decision variables should always
take non-negative values. This means the values for decision variables should be greater
than or equal to 0.
The process to formulate a Linear Programming problem
5. Solve the linear programming problem using either the simplex or graphical method.
For a problem to be a linear programming problem, the decision variables, objective function and
constraints all have to be linear functions.
If all the three conditions are satisfied, it is called a Linear Programming Problem.
Example: Consider a chocolate manufacturing company that produces only two types of
chocolate – A and B. Both the chocolates require Milk and Choco only. To manufacture each
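unit of A and B, the following quantities are required:
Each unit of A requires 1 unit of Milk and 3 units of Choco
Each unit of B requires 1 unit of Milk and 2 units of Choco
The company kitchen has a total of 5 units of Milk and 12 units of Choco. On each sale, the
company makes a profit of Rs 6 per unit of A sold and Rs 5 per unit of B sold.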
Now, the company wishes to maximize its profit. How many units of A and B should it produce
respectively?
Solution: The first thing I am going to do is represent the problem in a tabular form for better
understanding.
         Milk    Choco    Profit per unit
A        1       3        Rs 6
B        1       2        Rs 5
Total    5       12
Let the total number of units produced of A be X and of B be Y. The total profit Z the company
makes is given by the number of units of A and B produced multiplied by the per-unit profits of
Rs 6 and Rs 5 respectively:
Profit: Max Z = 6X + 5Y
The company will try to produce as many units of A and B as possible to maximize the profit. But
the resources Milk and Choco are available in limited amounts.
As per the above table, each unit of A and B requires 1 unit of Milk. The total amount of Milk
available is 5 units. To represent this mathematically,
X+Y ≤ 5
Also, each unit of A and B requires 3 units & 2 units of Choco respectively. The total amount of
Choco available is 12 units. To represent this mathematically,
3X+2Y ≤ 12
For the company to make maximum profit, the above inequalities have to be satisfied.
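One way to check a small two-variable problem like this is to list the corner points of the feasible region and evaluate the profit at each. The sketch below assumes the objective Z = 6X + 5Y (from the per-unit profits in the table) together with the two inequalities above and X ≥ 0, Y ≥ 0. The helper names are hypothetical, and corner enumeration is used here only as an illustration, not as the method of the notes.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is (a, b, c), meaning a*X + b*Y <= c.
# Non-negativity is written as -X <= 0 and -Y <= 0.
constraints = [(1, 1, 5), (3, 2, 12), (-1, 0, 0), (0, -1, 0)]


def intersection(c1, c2):
    """Solve the two boundary lines as simultaneous equations (Cramer's rule)."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None  # parallel lines never meet
    return (Fraction(r1 * b2 - r2 * b1, det), Fraction(a1 * r2 - a2 * r1, det))


def feasible(point):
    x, y = point
    return all(a * x + b * y <= c for a, b, c in constraints)


def profit(point):
    x, y = point
    return 6 * x + 5 * y


corners = set()
for c1, c2 in combinations(constraints, 2):
    point = intersection(c1, c2)
    if point is not None and feasible(point):
        corners.add(point)

best = max(corners, key=profit)
print(sorted(corners), best, profit(best))
```

Under these assumptions the best corner is X = 2, Y = 3 with a profit of Rs 27.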
The graphical method involves formulating a set of linear inequalities from the constraints.
The inequalities are then plotted on an X-Y plane. Once we have plotted all the inequalities on a
graph, the region where they intersect gives us the feasible region. The feasible region shows all
the values our model can take, and the optimal solution lies at one of its corner points.
Example: A farmer has recently acquired a 110-hectare piece of land. He has decided to grow
Wheat and Barley on that land. Due to the quality of the sun and the region’s excellent climate,
the entire production of Wheat and Barley can be sold. He wants to know how much of each
variety to plant in the 110 hectares, given the costs, net profits and labor requirements according
to the data shown below:
Variety    Cost per hectare (US$)    Net profit per hectare (US$)    Man-days per hectare
Wheat      100                       50                              10
Barley     200                       120                             30
The farmer has a budget of US$10,000 and availability of 1,200 man-days during the planning
horizon. Find the optimal solution and the optimal value.
Solution: To solve this problem, we first formulate our linear program. Let X and Y be the
number of hectares planted with Wheat and Barley respectively. The farmer wants to maximize
the total net profit:
Max Z = 50X + 120Y
1. It is given that the farmer has a total budget of US$10,000. The cost of producing Wheat and
Barley per hectare is also given to us. We have an upper cap on the total cost spent by the farmer.
So our equation becomes:
100X + 200Y ≤ 10,000
2. The next constraint is the upper cap on the availability of the total number of man-days for the
planning horizon. The total number of man-days available is 1200. As per the table, we are given
the man-days per hectare for Wheat and Barley.
10X + 30Y ≤ 1,200
3. The third constraint is the total area present for plantation. The total available area is 110
hectares. So the equation becomes,
X + Y ≤ 110
The values of X and Y will be greater than or equal to 0. This goes without saying.
X ≥ 0, Y ≥ 0
To plot the graph for the above equations, first I will simplify all the equations:
X + 2Y ≤ 100
X + 3Y ≤ 120
X + Y ≤ 110
Plot the first 2 lines on a graph in the first quadrant (as shown below)
The optimal feasible solution is achieved at the point of intersection where the budget and
man-days constraints are active. This means the point at which the lines X + 2Y = 100 and
X + 3Y = 120 intersect gives us the optimal solution.
The values of X and Y which give the optimal solution are (60, 20).
To maximize profit the farmer should plant Wheat and Barley on 60 hectares and 20 hectares
of land respectively.
The maximum profit the farmer will gain is:
Max Z = 50(60) + 120(20)
= US$5400
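The intersection point and the profit can be verified by solving the two active constraints as simultaneous equations. In this sketch the Barley figures (cost 200, net profit 120, man-days 30 per hectare) are assumptions inferred from the simplified constraints X + 2Y ≤ 100 and X + 3Y ≤ 120 and from the stated optimum of US$5400; the variable names are illustrative.

```python
from fractions import Fraction

# Active constraints written as equalities a*X + b*Y = r.
budget = (100, 200, 10000)   # 100X + 200Y = 10,000 (Barley cost of 200 is inferred)
man_days = (10, 30, 1200)    # 10X + 30Y = 1,200    (Barley man-days of 30 is inferred)

a1, b1, r1 = budget
a2, b2, r2 = man_days

# Cramer's rule for two equations in two unknowns.
det = a1 * b2 - a2 * b1
X = Fraction(r1 * b2 - r2 * b1, det)
Y = Fraction(a1 * r2 - a2 * r1, det)


def net_profit(x, y):
    # Assumed objective: 50 per hectare of Wheat, 120 per hectare of Barley.
    return 50 * x + 120 * y


# The land constraint X + Y <= 110 must also hold at this point.
land_ok = X + Y <= 110

# Compare with the other corner points of the feasible region.
other_corners = [(0, 0), (100, 0), (0, 40)]
best_other = max(net_profit(x, y) for x, y in other_corners)

print(X, Y, land_ok, net_profit(X, Y), best_other)
```

The intersection uses only 80 of the 110 hectares, so the land constraint is not active at the optimum.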
THE SIMPLEX METHOD
1. Set up the problem. That is, write the objective function and the inequality constraints.
2. Convert the inequalities into equations. This is done by adding one slack variable for
each inequality.
3. Construct the initial simplex tableau. Write the objective function as the bottom row.
4. The most negative entry in the bottom row identifies the pivot column.
5. Calculate the quotients. The smallest quotient identifies a row. The element in the
intersection of the column identified in step 4 and the row identified in this step is
identified as the pivot element. The quotients are computed by dividing the far right
column by the identified column in step 4. A quotient that is a zero, or a negative number,
or that has a zero in the denominator, is ignored.
6. Perform pivoting to make all other entries in this column zero. This is done the same
way as we did with the Gauss-Jordan method.
7. When there are no more negative entries in the bottom row, we are finished;
otherwise, we start again from step 4.
8. Read off your answers. Get the variables using the columns with 1 and 0s. All other
variables are zero. The maximum value you are looking for appears in the bottom right
hand corner.
Now, we use the simplex method to solve Example 3.1.1 solved geometrically in section 3.1.
Example 4.2.1
Niki holds two part-time jobs, Job I and Job II. She never wants to work more than a total of 12
hours a week. She has determined that for every hour she works at Job I, she needs 2 hours of
preparation time, and for every hour she works at Job II, she needs one hour of preparation time,
and she cannot spend more than 16 hours for preparation. If she makes $40 an hour at Job I, and
$30 an hour at Job II, how many hours should she work per week at each job to maximize her
income?
Solution
In solving this problem, we will follow the algorithm listed above.
STEP 1. Set up the problem. Write the objective function and the constraints.
Since the simplex method is used for problems that consist of many variables, it is not practical
to use the variables x, y, z etc. We use symbols x1, x2, x3, and so on.
Let
x1 = The number of hours per week Niki will work at Job I and
x2 = The number of hours per week Niki will work at Job II.
It is customary to choose the variable that is to be maximized as Z.
The problem is formulated the same way as we did in the last chapter.
Maximize Z = 40x1 + 30x2
Subject to: x1 + x2 ≤ 12
2x1 + x2 ≤ 16
x1 ≥ 0; x2 ≥ 0
STEP 2. Convert the inequalities into equations. This is done by adding one slack variable for
each inequality.
For example, to convert the inequality x1 + x2 ≤ 12 into an equation, we add a non-negative
variable y1, and we get
x1 + x2 + y1 = 12
Here the variable y1 picks up the slack, and it represents the amount by which x1+x2 falls short
of 12. In this problem, if Niki works fewer than 12 hours, say 10, then y1 is 2. Later when we
read off the final solution from the simplex table, the values of the slack variables will identify
the unused amounts.
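As a small illustration, the slack in each constraint can be computed directly for any proposed schedule. The function name is hypothetical; the two constraints are the ones stated above (at most 12 working hours and at most 16 preparation hours).

```python
def slacks(x1, x2):
    """Return (y1, y2): unused working hours and unused preparation hours."""
    y1 = 12 - (x1 + x2)       # from x1 + x2 + y1 = 12
    y2 = 16 - (2 * x1 + x2)   # from 2x1 + x2 + y2 = 16
    return y1, y2


# Working 10 hours in total (6 at Job I, 4 at Job II) leaves 2 working hours unused.
print(slacks(6, 4))
# Working 8 hours at Job I only leaves 4 working hours and no preparation time.
print(slacks(8, 0))
```

A negative slack would mean the schedule violates that constraint.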
We rewrite the objective function Z = 40x1 + 30x2 as −40x1 − 30x2 + Z = 0.
After adding the slack variables, our problem reads
Objective function: −40x1 − 30x2 + Z = 0
Subject to constraints: x1 + x2 + y1 = 12
2x1 + x2 + y2 = 16
x1 ≥ 0; x2 ≥ 0
STEP 3. Construct the initial simplex tableau. Each inequality constraint appears in its own
row. (The non-negativity constraints do not appear as rows in the simplex tableau.) Write the
objective function as the bottom row.
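Now that the inequalities are converted into equations, we can represent the problem as an
augmented matrix called the initial simplex tableau as follows.
x1     x2     y1    y2    Z   |   C
1      1      1     0     0   |   12
2      1      0     1     0   |   16
-------------------------------------
−40    −30    0     0     1   |   0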
Here the vertical line separates the left hand side of the equations from the right side. The
horizontal line separates the constraints from the objective function. The right side of the
equation is represented by the column C.
The reader needs to observe that the last four columns of this matrix look like the final matrix for
the solution of a system of equations. If we arbitrarily choose x1 = 0 and x2 = 0, we get
y1    y2    Z   |   C
1     0     0   |   12
0     1     0   |   16
0     0     1   |   0
which reads
y1 = 12, y2 = 16, Z = 0
The solution obtained by arbitrarily assigning values to some variables and then solving for the
remaining variables is called the basic solution associated with the tableau. So the above
solution is the basic solution associated with the initial simplex tableau. We can label the basic
solution variables to the right of the last column as shown in the table below.
STEP 4. The most negative entry in the bottom row identifies the pivot column.
The most negative entry in the bottom row is -40; therefore column 1 is identified.
Question Why do we choose the most negative entry in the bottom row?
Answer The most negative entry in the bottom row represents the largest coefficient in the
objective function - the coefficient whose entry will increase the value of the objective function
the quickest.
The simplex method begins at a corner point where all the main variables, the variables that have
symbols such as x1, x2, x3 etc., are zero. It then moves from a corner point to the adjacent corner
point, always increasing the value of the objective function. In the case of the objective function
Z = 40x1 + 30x2, it will make more sense to increase the value of x1 rather than x2. The variable
x1 represents the number of hours per week Niki works at Job I. Since Job I pays $40 per hour as
opposed to Job II which pays only $30, the variable x1 will increase the objective function by $40
for a unit of increase in the variable x1.
STEP 5. Calculate the quotients. The smallest quotient identifies a row. The element in the
intersection of the column identified in step 4 and the row identified in this step is identified
as the pivot element.
Following the algorithm, in order to calculate the quotients, we divide the entries in the far right
column by the entries in column 1, excluding the entry in the bottom row: 12 ÷ 1 = 12 and
16 ÷ 2 = 8.
The smallest of the two quotients, 12 and 8, is 8. Therefore row 2 is identified. The intersection
of column 1 and row 2 is the entry 2, which has been highlighted. This is our pivot element.
Question Why do we find quotients, and why does the smallest quotient identify a row?
Answer When we choose the most negative entry in the bottom row, we are trying to increase the
value of the objective function by bringing in the variable x1. But we cannot choose any value
for x1. Can we let x1 = 100? Definitely not! That is because Niki never wants to work for more
than 12 hours at both jobs combined: x1 + x2 ≤ 12. Can we let x1 = 12? Again, the answer is no
because the preparation time for Job I is two times the time spent on the job. Since Niki never
wants to spend more than 16 hours for preparation, the maximum time she can work at Job I is
16 ÷ 2 = 8.
Now you see the purpose of computing the quotients; using the quotients to identify the pivot
element guarantees that we do not violate the constraints.
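Steps 4 and 5 can be sketched in Python on the initial tableau of this example. The list-of-rows layout (columns x1, x2, y1, y2, Z, C) and the variable names are assumptions made for illustration. Rows whose pivot-column entry is not positive are skipped, which covers the negative and zero-denominator cases mentioned in the algorithm.

```python
# Initial simplex tableau; columns are x1, x2, y1, y2, Z, C.
tableau = [
    [1, 1, 1, 0, 0, 12],
    [2, 1, 0, 1, 0, 16],
    [-40, -30, 0, 0, 1, 0],   # objective row: -40x1 - 30x2 + Z = 0
]

# Step 4: the most negative entry in the bottom row identifies the pivot column.
bottom = tableau[-1][:-1]
pivot_col = bottom.index(min(bottom))

# Step 5: divide the far right column by the pivot column, constraint rows only.
quotients = {}
for r, row in enumerate(tableau[:-1]):
    if row[pivot_col] > 0:
        quotients[r] = row[-1] / row[pivot_col]

# The smallest quotient identifies the pivot row.
pivot_row = min(quotients, key=quotients.get)
pivot_element = tableau[pivot_row][pivot_col]

print(pivot_col, quotients, pivot_row, pivot_element)
```

Indices are zero-based here, so `pivot_col == 0` is column 1 and `pivot_row == 1` is row 2 in the wording of the notes.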
Question Why do we identify the pivot element?
Answer As we have mentioned earlier, the simplex method begins with a corner point and then
moves to the next corner point always improving the value of the objective function. The value
of the objective function is improved by changing the number of units of the variables. We may
add the number of units of one variable, while throwing away the units of another. Pivoting
allows us to do just that.
The variable whose units are being added is called the entering variable, and the variable whose
units are being replaced is called the departing variable. The entering variable in the above
table is x1, and it was identified by the most negative entry in the bottom row. The departing
variable y2 was identified by the lowest of all quotients.
STEP 6. Perform pivoting to make all other entries in this column zero.
In chapter 2, we used pivoting to obtain the row echelon form of an augmented matrix. Pivoting
is a process of obtaining a 1 in the location of the pivot element, and then making all other
entries zeros in that column. So now our job is to make our pivot element a 1 by dividing the
entire second row by 2. The result follows.
x1     x2     y1    y2     Z   |   C
1      1      1     0      0   |   12
1      1/2    0     1/2    0   |   8
−40    −30    0     0      1   |   0
To obtain a zero in the entry first above the pivot element, we multiply the second row by -1 and
add it to row 1. We get
x1     x2     y1    y2      Z   |   C
0      1/2    1     −1/2    0   |   4
1      1/2    0     1/2     0   |   8
−40    −30    0     0       1   |   0
To obtain a zero in the element below the pivot, we multiply the second row by 40 and add it to
the last row.
x1     x2     y1    y2      Z   |   C
0      1/2    1     −1/2    0   |   4
1      1/2    0     1/2     0   |   8
0      −10    0     20      1   |   320
We now determine the basic solution associated with this tableau. By arbitrarily choosing x2 = 0
and y2 = 0, we obtain x1 = 8, y1 = 4, and Z = 320. If we write the augmented matrix, whose left
side is a matrix with columns that have one 1 and all other entries zeros, we get the following
matrix stating the same thing.
x1    y1    Z   |   C
0     1     0   |   4
1     0     0   |   8
0     0     1   |   320
We can restate the solution associated with this matrix as x1 = 8, x2 = 0, y1 = 4, y2 = 0 and
Z = 320. At this stage of the game, it reads that if Niki works 8 hours at Job I, and no hours at
Job II, her profit Z will be $320. Recall from Example 3.1.1 in section 3.1 that (8, 0) was one of
our corner points. Here y1 = 4 and y2 = 0 mean that she will be left with 4 hours of working time
and no preparation time.
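The pivoting of Step 6 can be sketched as a small Python function. The function name and the list-of-rows layout (columns x1, x2, y1, y2, Z, C) are assumptions for illustration; exact fractions are used so that entries such as 1/2 are not rounded.

```python
from fractions import Fraction


def pivot(tableau, pivot_row, pivot_col):
    """Return a new tableau with a 1 at the pivot and zeros elsewhere in its column."""
    rows = [[Fraction(value) for value in row] for row in tableau]
    pivot_value = rows[pivot_row][pivot_col]
    # Make the pivot element 1 by dividing its whole row by the pivot value.
    rows[pivot_row] = [value / pivot_value for value in rows[pivot_row]]
    # Subtract a multiple of the pivot row from every other row to clear the column.
    for r in range(len(rows)):
        if r != pivot_row:
            factor = rows[r][pivot_col]
            rows[r] = [v - factor * p for v, p in zip(rows[r], rows[pivot_row])]
    return rows


initial = [
    [1, 1, 1, 0, 0, 12],
    [2, 1, 0, 1, 0, 16],
    [-40, -30, 0, 0, 1, 0],
]

# Pivot on row 2, column 1 of the notes (zero-based indices 1 and 0).
after_first_pivot = pivot(initial, 1, 0)
for row in after_first_pivot:
    print([str(value) for value in row])
```

The bottom right entry of the result is 320, and the bottom row still contains the negative entry -10, so another pass is needed.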
STEP 7. When there are no more negative entries in the bottom row, we are finished;
otherwise, we start again from step 4.
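Since there is still a negative entry, -10, in the bottom row, we need to begin, again, from step 4.
This time we will not repeat the details of every step; instead, we will identify the column and
row that give us the pivot element. The pivot column is column 2 (x2). The quotients are
4 ÷ 1/2 = 8 and 8 ÷ 1/2 = 16, so the pivot element is the entry 1/2 in row 1. The result of
pivoting is as follows.
x1    x2    y1    y2    Z   |   C
0     1     2     −1    0   |   8
1     0     −1    1     0   |   4
0     0     20    10    1   |   400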
We no longer have negative entries in the bottom row, therefore we are finished.
Question Why are we finished when there are no negative entries in the bottom row?
Answer The answer lies in the bottom row. The bottom row corresponds to the equation:
0x1 + 0x2 + 20y1 + 10y2 + Z = 400 or Z = 400 − 20y1 − 10y2
Since all variables are non-negative, the highest value Z can ever achieve is 400, and that will
happen only when y1 and y2 are zero.
STEP 8. Read off your answers.
We now read off our answers, that is, we determine the basic solution associated with the final
simplex tableau. Again, we look at the columns that have a 1 and all other entries zeros. Since
the columns labeled y1 and y2 are not such columns, we arbitrarily choose y1 = 0 and y2 = 0,
and we get
x1    x2    Z   |   C
0     1     0   |   8
1     0     0   |   4
0     0     1   |   400
The matrix reads x1 = 4, x2 = 8 and Z = 400.
The final solution says that if Niki works 4 hours at Job I and 8 hours at Job II, she will
maximize her income to $400. Since both slack variables are zero, it means that she would have
used up all the working time, as well as the preparation time, and none will be left.
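Steps 4 to 8 can be put together in one short Python sketch for a maximization problem whose constraints are all of the ≤ type. This is a teaching sketch under stated assumptions, not a general-purpose solver: the function names and tableau layout (columns x1, x2, y1, y2, Z, C) are illustrative, and ties and degenerate cases are not handled specially.

```python
from fractions import Fraction


def simplex_max(tableau):
    """Repeat steps 4 to 7 until the bottom row has no negative entries."""
    rows = [[Fraction(value) for value in row] for row in tableau]
    while True:
        bottom = rows[-1][:-1]
        most_negative = min(bottom)
        if most_negative >= 0:
            return rows                      # Step 7: finished
        col = bottom.index(most_negative)    # Step 4: pivot column
        # Step 5: quotients for rows with a positive entry in the pivot column.
        candidates = [(row[-1] / row[col], r)
                      for r, row in enumerate(rows[:-1]) if row[col] > 0]
        if not candidates:
            raise ValueError("objective is unbounded")
        _, pivot_row = min(candidates)
        # Step 6: pivot.
        pivot_value = rows[pivot_row][col]
        rows[pivot_row] = [v / pivot_value for v in rows[pivot_row]]
        for r in range(len(rows)):
            if r != pivot_row:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[pivot_row])]


def read_solution(rows, names):
    """Step 8: columns with a single 1 and zeros elsewhere give the basic variables."""
    solution = {name: Fraction(0) for name in names}
    for c, name in enumerate(names):
        column = [row[c] for row in rows]
        if column.count(1) == 1 and column.count(0) == len(column) - 1:
            solution[name] = rows[column.index(1)][-1]
    return solution


initial = [
    [1, 1, 1, 0, 0, 12],
    [2, 1, 0, 1, 0, 16],
    [-40, -30, 0, 0, 1, 0],
]
final = simplex_max(initial)
solution = read_solution(final, ["x1", "x2", "y1", "y2", "Z"])
print({name: str(value) for name, value in solution.items()})
```

For Niki's problem the sketch stops after two pivots and reproduces the answer read off above: x1 = 4, x2 = 8, y1 = 0, y2 = 0 and Z = 400.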