Manju, Associate professor, CSE, IIUC

Dr. Mohammad Manjur Alam (Manju)


Associate Professor
Department of Computer Science and Engineering
International Islamic University Chittagong.
Email: [email protected]

CHAPTER ONE
INTRODUCTION TO STATISTICS
Definition of Statistics

Statistics is the science of data. It enables the collection, organization, presentation, analysis, and
interpretation of numerical data.

Dr. A.L. Bowley defined statistics as follows: “Statistics are numerical statements of facts in any department
of enquiry placed in relation to each other”.

R.A. Fisher stated, “The science of statistics is essentially a branch of applied mathematics and may be
regarded as mathematics applied to observational data”.

According to Croxton and Cowden, “Statistics may be defined as the science of collection, presentation,
analysis and interpretation of numerical data”.

Origin and Development of Statistics


The word 'Statistics' seems to have been derived from the Latin word 'status', the Italian word 'statista'
or the German word 'statistik', each of which means a 'political state'. In ancient times, governments used
to collect information about the population and the property or wealth of the country. The former gave
the government an idea of the manpower of the country, and the latter provided a basis for introducing
new taxes and levies.

Major areas of Statistics:

1. Descriptive Statistics
2. Inferential Statistics.

Descriptive Statistics

Descriptive statistics involves methods of organizing, picturing and summarizing information
from data.

Inferential Statistics

Inferential statistics involves methods of using information from a sample to draw conclusions
about the population.


Uses and Importance of Statistics to Computer Students


Statistics is primarily used either to make predictions from the available data or to draw conclusions
about a population of interest when only sample data are available. In both cases statistics tries to make
sense of the uncertainty in the available data. When making predictions, statisticians determine whether
the differences between data points are due to chance or reflect a systematic relationship. The stronger
the systematic relationship observed, the better the prediction a statistician can make. The more random
error observed, the more uncertain the prediction. Statisticians can also attach a measure of
uncertainty to a prediction. When making inferences about a population, the statistician tries to
assess how well a summary statistic computed from a sample estimates the corresponding population value.
For computer students, knowing the basic principles and methods of statistics can help in research work,
such as comparing the speed of internet connections in different countries and estimating the probability
that each country experiences the same level of connection speed in a week, month or year. It can also
help in comparing operating systems to decide which is best to use. Whenever there is a need to compare
data and choose the best option, statistics can inform the answer.

Functions of statistics:

1. It simplifies masses of figures.
2. It facilitates classification and comparison of data.
3. It helps in determining the relationship between two or more phenomena.
4. It helps in formulating and testing suitable hypotheses.
5. It helps central management in formulating suitable future policy.
6. It helps in predicting future trends.

Limitations of Statistics:

1. Statistics deals with aggregates of items and not with an individual item or measurement.
2. Statistics deals only with quantitative characteristics.
3. Statistical laws hold good only on average.
4. It plays only an auxiliary role in summarizing a fact.
5. Statistics can be misused.

Characteristics of statistics:

The characteristics of statistics are given below:

1. Statistics should deal with aggregates of individuals rather than with an individual alone.

2. Statistics should be expressed as numerical figures.

3. Statistics should be affected by a multiplicity of causes.


4. Statistics should be collected or estimated with a reasonable standard of accuracy.

5. Statistics should be collected for a pre-determined purpose.

6. Statistics should be collected in a systematic manner.

Importance of statistics:

The importance of statistics is given below:

1. Statistics of wealth and manpower are important for development and planning.

2. Statistics are invaluable in business and commerce.

3. Statistics help planners to estimate the revenue, income and expenditure of the country.

4. Agricultural statistics may play a key role in agricultural development.

5. In industry, statistics is widely used for quality control.

6. Statistics is also used in education and psychology.

Relation between Computer and Statistics:

Statistics is defined as the science of collecting, organizing, and interpreting numerical facts, which we call
data. It is very important for a student of computer science, as computer science also deals with the
organization and interpretation of numerical facts. In fact, many principles of computer science draw on
concepts of statistics.

The computer can process large amounts of data quickly and accurately. Important statistical packages
that have been used for processing large amounts of data include SPSS, SAS, STATA, S-plus and
MINITAB.

Population with example:

The totality of all elements under study or discussion is called the population. The “population” in
statistics includes all members of a defined group that we are studying or collecting information about.

Example: If we measure the heights and weights of every person in a group, that group is the population.

Populations are of two types: (i) Finite population;

(ii) Infinite population.

Finite population: A population is called a finite population if its elements can be counted. Example:
Number of students in a university.

Infinite population: A population is called an infinite population if its elements cannot be counted.

Example: Number of fish in the Bay of Bengal, which is too large to count in practice.


Sample with example:

A representative part of the population is called a sample. More simply, a part of the population is called
a sample.

Example: If we measure the heights and weights of IIUC students, then the students of the CSE/EEE/ETE
departments form a sample.
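
The idea of a sample standing in for a population can be sketched in Python. This is only an illustration: the population of 1,000 heights is simulated, and the sample size of 50, the seed and the variable names are assumptions, not values from these notes.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch gives the same result on every run

# Hypothetical population: simulated heights (cm) of 1,000 students.
population = [random.gauss(165, 8) for _ in range(1000)]

# A simple random sample of 50 students, drawn without replacement.
sample = random.sample(population, 50)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population size: {len(population)}, mean height: {population_mean:.1f} cm")
print(f"Sample size:     {len(sample)}, mean height: {sample_mean:.1f} cm")
```

The sample mean will usually be close to, but not exactly equal to, the population mean. Inferential statistics is concerned with measuring that gap.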

Difference between a population and a sample

The main differences between a population and a sample are as follows:

Population | Sample
The totality of all elements under study or discussion is called the population. | A part of the population is called a sample.
A population includes each element from the set of observations that can be made. | A sample consists only of observations drawn from the population.
A population may be finite or infinite. | A sample must be finite.
Collecting data from every element of a population is not easy. | Collecting data from a sample is comparatively easy.
Example: All registered voters in our country. | Example: All registered voters in Chittagong district.

Figure: Population and sample

Variable with example:

Any characteristic which varies from individual to individual is called a variable. Variables are represented
by symbols (e.g., x, y, or z).

Age, weight, height, sex, income and expenses, country of birth, capital expenditure, class
grades, eye colour and vehicle type are examples of variables.

Variables can be classified as qualitative (categorical) or quantitative (numeric).


Qualitative variable:
A variable which cannot be expressed numerically is called a qualitative (categorical) variable.
Examples: Hair colour, gender, field of study, college attended, political affiliation, status of disease
infection, business type, eye colour, religion and brand.

Quantitative variable:

A variable which can be expressed numerically is called a quantitative variable.

Examples: Height, age, crop yield, GPA, salary, temperature, area, air pollution index (measured in parts
per million), etc.

Quantitative variables can be further classified as discrete or continuous.

Discrete variable:

A variable is called a discrete variable if it can take only isolated (whole) values. Examples of discrete
variables include the number of registered cars, the number of business locations, and the number of children
in a family, all of which are measured in whole units (i.e., 1, 2, 3 cars).

Continuous variable:

A variable is called a continuous variable if it can take any value between certain limits, i.e., observations
can take any value within a certain range of real numbers. Examples of continuous variables include
height, time, age, and temperature.

N.B.: A continuous variable is often recorded as discrete values because measurements are rounded (for
example, age in whole years), but a discrete variable cannot be treated as continuous.

Figure of different types of variable:

Qualitative variable and quantitative variable:

Qualitative variable | Quantitative variable
Cannot be measured numerically. | Can be measured numerically.
It is not countable. | It is countable.
A qualitative variable is measured using nominal and ordinal scales. | A quantitative variable is measured using interval and ratio scales.
The values of a qualitative variable are generally discrete. | The values of a quantitative variable may be discrete or continuous.
Algebraic operations are meaningless. | Algebraic operations are meaningful.
Skin colour, hair colour, religion, gender, merit, education and character are examples of qualitative variables. | GPA, age, weight, height, income, temperature, etc. are examples of quantitative variables.

Discrete variable and continuous variable:

Discrete variable | Continuous variable
A variable is called a discrete variable if it can take only isolated values. | A variable is called a continuous variable if it can take any value between certain limits.
It is countable. | It is measurable.
The number of possible values may be finite or infinite. | The number of possible values is always infinite.
Number of family members and number of students are discrete variables. | Age, weight, height, salary, etc. are continuous variables.

Constant:

A constant is a quantity whose value does not change. It is usually denoted by a, b, c or d.

Example: Number of fingers on a hand or toes on a foot.

Variable | Constant
A variable is always subject to change. | A constant never changes.
It is denoted by X, Y, Z, U, V. | It is denoted by a, b, c or d.
Variables are classified as qualitative and quantitative. | A constant has no such classification.
Age, weight, height, salary, etc. are variables. | The total number of days in a week and the number of fingers on a hand are constants.

Statistical Data

A set of observations obtained from a particular enquiry is called data.

Example: Incomes of workers or examination marks of students.

According to source, data are of two types:

1. Primary data;
2. Secondary data;

Primary data: Data collected by the investigator himself/herself for a specific purpose.

Example: Data collected directly from the field by government, public and private organizations, research
bodies, research scholars and NGOs for their official records and research purposes are primary data.

Secondary data: When an investigator uses data which have already been collected by others, such data
are called secondary data.

Such data can be obtained from journals, reports, the Internet, books, magazines, newspapers, office
statistics, the government statistics service, the office of national statistics, and centres for applied social
surveys.


Methods of collecting primary data:

1. Through interview
2. Through questionnaire
3. Through schedule
4. Through local agent
5. Through observation
6. Through experimentation

Difference between Primary and Secondary Data

Primary Data

1. Primary data are always original as they are collected by the investigator.

2. Primary data suit the purpose of the enquiry because they have been collected systematically for it.

3. Primary data are expensive and time consuming to collect.

4. Extra precautions are not required.

5. Primary data are in the form of raw material.

6. There is a possibility of personal prejudice.

Secondary Data

1. Secondary data lack originality. The investigator makes use of data collected by other agencies.

2. Secondary data may or may not suit the objectives of the enquiry.

3. Secondary data are relatively cheap.

4. Secondary data must be used with great care and caution.

5. Secondary data are usually in the form of ready-made products.

6. There is a lesser degree of personal prejudice.

Questionnaire:

In any survey, information is collected according to some predetermined questions. The set of questions
for a survey constitutes the questionnaire.

Or, a set of printed or written questions with a choice of answers, devised for the purposes of a survey or
statistical study.


Characteristics of a good Questionnaire:

1. Questions should be worded simply and clearly, not ambiguous or vague.

2. The questionnaire should begin with an introduction.

3. Questions should be logically arranged.

4. Questions should be as few as possible.

5. Questions should be easy to understand.

6. Questions should not have multiple meanings.

7. Questions should be capable of objective answers.

8. Questions should be well designed.

9. Personal questions should be avoided.

10. The questionnaire should have a descriptive title.

SAMPLE QUESTIONNAIRE:

Questionnaire on “Socio-Economic Background and Academic Performance of IIUC Students”.

1. Student's Name: __________ ID: __________
2. Semester: __________
3. Department: __________
4. Mobile number (if possible): __________
5. Gender: (a) Male (b) Female
6. Age: __________
7. Religion: (a) Islam (b) Hindu (c) Others
8. Home District: __________
9. Number of brothers and sisters: __________
10. What is your serial number as a son/daughter of your parents? (a) First (b) Second (c) Third (d) Fourth (e) __________
11. Area of your school at SSC or equivalent level: (a) City (b) Small town (c) Village
12. Area of your school at HSC or equivalent level: (a) City (b) Small town (c) Village
13. Did you pass (a) SSC/HSC (b) O-level/A-level (c) Madrasa qualifications (d) Others
14. SSC and HSC (or equivalent) GPA: SSC: _____ and HSC: _____
15. Father's Education: (a) No education (b) Primary (c) Secondary (d) Higher Secondary (e) University
16. Mother's Education: (a) No education (b) Primary (c) Secondary (d) Higher Secondary (e) University
17. What is (or was) your father's occupation? (a) Agriculture (b) Business (c) Service (d) Teacher (e) Others
18. What is (or was) your mother's occupation? (a) Housewife (b) Business (c) Service (d) Teacher (e) Others
19. Monthly income of your father and mother: __________
20. Economic status of your family: (a) Poor (b) Lower middle class (c) Middle class (d) Higher middle class (e) Rich
21. Reason for choosing IIUC: (a) Islamic (b) Tuition (c) Safety and distance (d) Faculty (e) Good private university (f) Scholarship (g) Others
22. How did you get information about IIUC? (a) Advertisement (b) Faculty (c) Friends/relatives (d) Internet (e) Others
23. While attending university, I live: (a) In a hall (b) At home/with family (c) In a mess (d) With relatives/friends
24. Approximately, what was your grade average in your final year at IIUC? _____ out of 4.00
25. Do you discuss your grades with your guardian? (a) Yes (b) No
26. Compared to my friends at university, I am, on an academic basis, performing: (a) Better (b) Same (c) Worse
27. How often did you miss classes? (a) Always (b) Sometimes (c) Never (d) Once every week
28. Usually I study in: (a) Library (b) Class (c) My room (d) Other, please specify: __________
29. Generally, I study _____ hours daily.
30. Are you satisfied with the IIUC academic system? (a) Yes (b) No
31. Do you think that the tuition fee is high at IIUC? (a) Yes (b) No
32. What is your dream career? (a) Subject-related job (b) Teacher (c) Banker (d) Service (e) Others

THANK YOU

Types of Questionnaire:

A questionnaire can contain two types of question:

1. Unstructured questions.
2. Structured questions.

Unstructured question: Unstructured questions are open-ended; respondents can answer in their own
words.

Example: What is your occupation?

Structured question: A question in which the respondent is given a limited set of alternative responses
and asked to choose the one closest to his/her own viewpoint.

Example: What is your occupation?

1. Agriculture 2. Business 3. Service 4. Teacher and 5. Others.

Methods of organization of data:

1. Classification
2. Tabulation
3. Frequency distribution

Classification: Classification is the first statistical technique to condense the raw data. It is the process of
arranging data in different groups or classes according to their affinities.

Tabulation: Tabulation is a logical and systematic arrangement of statistical data in rows and columns.

Frequency:

The number of times that a given value occurs in a group or class is known as its frequency.

For example, if four students have a score of 80 in mathematics, then the score of 80 is said to have a
frequency of 4. The frequency of a data value is often represented by f.
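
Counting frequencies is straightforward to sketch in Python with the standard library's `collections.Counter`. The list of scores below is hypothetical and chosen only so that the score 80 occurs four times, as in the example above.

```python
from collections import Counter

# Hypothetical mathematics scores; the value 80 appears four times.
scores = [80, 65, 80, 72, 90, 80, 65, 80]

# Counter maps each distinct value to the number of times it occurs (its frequency f).
frequency = Counter(scores)

for value in sorted(frequency):
    print(f"score {value}: f = {frequency[value]}")
```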


Frequency distribution

An arrangement of observations into classes together with their frequencies is called a frequency distribution.

Construction of a frequency distribution:

Following are the steps for the construction of a frequency distribution:

1. Class limits/Range: The class limits are the lowest and highest values that can be included in a class.

2. Number of classes: K = 1 + 3.322 log10(N), where N is the number of observations (Sturges' rule).

3. Class interval: C.I = Range / K, where Range = H.V - L.V (highest value minus lowest value).

4. Class frequency/Tally: The number of observations falling in a particular class is known as the
class frequency of that class.

5. Mid value: (U.L + L.L) / 2, where U.L and L.L are the upper and lower limits of the class.

6. Cumulative frequency: The number of observations less than the upper limit of a class, when
cumulation is done from the top of the table.
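
The steps above can be sketched in Python as follows. This is a minimal illustration rather than the method prescribed in the lecture: the function name `frequency_table`, the 20 sample values, the choice to round both K and the class interval up to whole numbers, and the use of the exclusive method (lower limit <= x < upper limit) are all assumptions. The sketch also assumes the data are not all equal, since the class interval would then be zero.

```python
import math

def frequency_table(data):
    """Return rows of (lower limit, upper limit, mid value, frequency, cumulative frequency)."""
    n = len(data)
    low, high = min(data), max(data)
    k = math.ceil(1 + 3.322 * math.log10(n))   # number of classes, rounded up (assumption)
    width = math.ceil((high - low) / k)         # class interval = Range / K, rounded up (assumption)

    frequencies = [0] * k
    for x in data:
        # Exclusive method: a class holds values with lower <= x < upper.
        # min(...) keeps the highest value inside the last class.
        index = min(int((x - low) // width), k - 1)
        frequencies[index] += 1

    rows, cumulative = [], 0
    for i, f in enumerate(frequencies):
        lower = low + i * width
        upper = lower + width
        cumulative += f                         # running total from the top of the table
        rows.append((lower, upper, (lower + upper) / 2, f, cumulative))
    return rows

# Hypothetical data set of 20 observations, for illustration only.
data = [12, 15, 17, 21, 22, 24, 25, 27, 28, 30,
        31, 31, 33, 34, 36, 38, 40, 41, 44, 47]

for lower, upper, mid, f, cf in frequency_table(data):
    print(f"{lower}-{upper}  mid={mid}  f={f}  cf={cf}")
```

For these 20 values the rule gives K = 6 classes of width 6, starting at the lowest value 12. Other textbooks round K differently or start the first class at a convenient round number, so the resulting table is one reasonable choice among several.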

Assignment-1: The following data refer to the ages of 60 employees of a firm.

33, 41, 21, 25, 36, 38, 35, 36, 35, 37, 42, 30, 35, 37, 36, 38, 30, 54, 40, 48, 15, 28, 51, 42, 25, 41, 30, 27,
42, 36, 28, 26, 37, 54, 44, 31, 36, 40, 36, 22, 30, 31, 19, 48, 16, 42, 32, 21, 22, 40, 43, 42, 39, 38, 37, 33,
49, 47, 46, 48.

(i) Construct a frequency table with suitable class interval.


(ii) Draw histogram, frequency polygon and ogive.

Hint: Follow the method shown in the class lecture.

Assignment-2: Monthly income (in Lac Tk.) of 30 firms in a certain area is given below:

20, 19, 10, 13, 8, 15, 9, 16, 18, 8, 16, 17, 12, 11, 10, 19, 18, 17, 14, 15, 12, 13, 14, 12, 12.5, 18.5, 20, 15, 18, 16.

(i) Construct a suitable frequency distribution.

(ii) Draw histogram and ogive.

Assignment-3: Tube light production (hourly) of 30 machines is given below:

9, 20, 19, 10, 12, 15, 9, 16, 18, 8, 16, 17, 12, 11, 10, 19, 18, 17, 14, 15, 12, 13, 14, 12, 12, 18, 20, 15, 18, 15.

(i) Construct a frequency distribution with suitable class interval.


(ii) Draw ogive curve.

Example 6

The number of calls from motorists per day for roadside service was recorded for the month of December
2003. The results were as follows:


Set up a frequency table for this set of data values.

Graphical representation of statistical data

In addition to presenting statistical data in tabular form, one can present the same data through visual
aids, namely graphs and diagrams.

Types of Graphs and Diagrams:

1. Histogram
2. Bar diagram
3. Frequency polygon
4. Pie diagram
5. Scatter diagram
6. Line diagram
7. Ogive
8. Stem-and-leaf plot
9. Box plot

Histogram:

A histogram consists of tabular frequencies, shown as adjacent rectangles, erected over discrete
intervals (bins), with the area of each rectangle equal to the frequency of the observations in the interval.
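
Because it is the area of each rectangle that represents the frequency, the height of a bar is the frequency divided by the class width. The short Python sketch below checks this for three hypothetical classes of unequal width; the class limits and frequencies are invented for illustration.

```python
# Hypothetical classes with unequal widths: (lower limit, upper limit, frequency).
classes = [(0, 10, 5), (10, 20, 12), (20, 40, 8)]

bars = []
for lower, upper, frequency in classes:
    width = upper - lower
    height = frequency / width          # frequency per unit of class width
    bars.append((lower, upper, height))
    print(f"{lower}-{upper}: width={width}, height={height:.2f}, area={height * width:.0f}")
```

When all classes have the same width, the heights are simply proportional to the frequencies, which is why equal-width histograms are usually drawn with frequency on the vertical axis.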

Bar chart:

A bar chart is a chart with rectangular bars with lengths proportional to the values that they represent. The
bars can be plotted vertically or horizontally.

Frequency polygon


Pie chart:

A pie chart shows percentage values as slices of a pie.
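
To draw a pie chart by hand, each category's share of the total is converted to a percentage and to a central angle out of 360 degrees. The following Python sketch shows the arithmetic; the department counts are hypothetical.

```python
# Hypothetical counts of students by department (illustrative values only).
counts = {"CSE": 40, "EEE": 25, "ETE": 15, "Others": 20}

total = sum(counts.values())
slices = {}
for label, count in counts.items():
    percentage = 100 * count / total    # share of the whole, in percent
    angle = 360 * count / total         # central angle of the slice, in degrees
    slices[label] = (percentage, angle)
    print(f"{label}: {percentage:.1f}% -> {angle:.1f} degrees")
```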

Scatter diagram:


Line chart:

A line chart is a two-dimensional scatter plot of ordered observations where the observations are connected
following their order.

Box plot

Cumulative frequency polygon or ogive

Figure: Ogive using upper class limits


Figure: Ogive using lower class limits

Figure: Ogive using both upper and lower class limits
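
The points plotted on an ogive are cumulative frequencies. A "less than" ogive plots the running total from the top of the table against the upper class limits, and a "more than" ogive plots the running total from the bottom of the table against the lower class limits. The Python sketch below computes both sets of points for a hypothetical frequency table; the class limits and frequencies are assumptions for illustration.

```python
from itertools import accumulate

# Hypothetical frequency table: class limits and class frequencies.
lower_limits = [10, 20, 30, 40]
upper_limits = [20, 30, 40, 50]
frequencies = [4, 10, 12, 4]

# "Less than" cumulative frequencies: running total from the top of the table.
less_than = list(accumulate(frequencies))

# "More than" cumulative frequencies: running total from the bottom of the table.
more_than = list(accumulate(reversed(frequencies)))[::-1]

print("Less-than ogive points:", list(zip(upper_limits, less_than)))
print("More-than ogive points:", list(zip(lower_limits, more_than)))
```

When both curves are drawn on the same axes, they cross at the point where half of the observations lie on each side, which is how the median is read from an ogive.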

The difference between graph and diagram:-

Diagram

1) A diagram can be drawn on plain paper or any other sort of paper.
2) A diagram is more attractive and impressive.
3) A diagram has a lasting effect.
4) A diagram cannot be used to locate the median, mode, etc.
5) A diagram gives only an approximate idea.

Graph

1) A graph is generally drawn on graph paper.
2) A graph is less attractive and impressive.
3) A graph does not have a lasting effect.
4) A graph can be used to locate the median, mode, etc.
5) A graph gives more exact information, not just an approximate idea.
