Statistics Lecture Notes - UNIT 1

class notes

Uploaded by

mphephubono9
Copyright
© All Rights Reserved

UNIT 1: INTRODUCTION

OUTCOMES
This unit deals with the role of statistics in the data analysis process. Concepts that are basic to the study of
statistics are discussed.

After completion of this unit you will be able to:

• recognise the role of statistics in life
• understand the language of statistics
• select suitable measuring scales for different types of data
• understand the role of computers in statistics

We live in an era where we are faced with increasing amounts of information, referred to as data. To perform
many tasks efficiently we need a basic understanding of statistical methods. The field of statistics covers a
problem-solving process that seeks answers to questions through data. To be an informed consumer of
information you must be able to:
• extract information from tables and graphs
• follow numerical arguments
• understand the basics of how data should be gathered, summarised and analysed to draw statistical conclusions

1.1. PROBLEM-SOLVING STEPS


Solving a statistical problem typically comprises the following steps:
1. Identify the problem.
2. Collect the information (data): decide how to measure it, what the appropriate data source is (existing or new), whether to use the population or a sample, and which sampling method to use.
3. Organise and summarise the information: use tables, graphs and numerical summaries to understand the important characteristics of the data, which give guidance in selecting appropriate methods for further analysis.
4. Interpret results: draw conclusions, make recommendations and assess the risk of an incorrect decision. This involves generalising from a sample of individuals or objects that we have studied to a larger population.
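As a minimal sketch of steps 3 and 4, the Python snippet below summarises a small sample and compares it with a target value. The fill volumes, the 340 ml target and all variable names are made up for illustration; they do not come from these notes.

```python
import statistics

# Hypothetical fill volumes in millilitres (illustrative values only).
fill_volumes_ml = [339.2, 340.5, 341.1, 338.7, 340.0, 339.9, 340.8, 339.4]
target_ml = 340.0  # assumed nominal fill volume, not taken from the notes

# Step 3: organise and summarise the information.
sample_size = len(fill_volumes_ml)
sample_mean = statistics.mean(fill_volumes_ml)
sample_sd = statistics.stdev(fill_volumes_ml)  # sample standard deviation

# Step 4: interpret the results (here only a simple comparison with the target).
deviation_from_target = sample_mean - target_ml

print(f"n = {sample_size}, mean = {sample_mean:.2f} ml, sd = {sample_sd:.2f} ml")
print(f"mean deviates from target by {deviation_from_target:+.2f} ml")
```

A real investigation would also have to assess the risk of an incorrect decision, which this sketch leaves out.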

Example 1.1
As part of a weekly check to assess the calibration of a filling machine, the quality control manager randomly
selects 50 bottles of beer that were filled on a specific day.

1. Identify the problem

2. Collect the information

3. Organise and summarise the information

4. Interpret results

Key components of statistical thinking

• Use data whenever possible to guide the analysis
• Look for connections and relationships
• Understand why data values differ from one another

1.2. DEFINITION
Statistics is the scientific discipline that provides methods to help us make sense of data by:
• collecting data in a methodical way
• analysing data with methods that organise and summarise it using tables, graphs and numbers
• interpreting data to draw conclusions or to answer questions

The field of statistics is subdivided into descriptive and inferential statistics:
• Descriptive statistics includes the collection and summarising of data to give an overview of the information collected
• Inferential statistics is the process of making an estimate, prediction or decision about a population based on sample data
  o A population is almost always very large
  o A sample is drawn and the data are summarised using descriptive techniques
  o The results are used to make decisions about the population
  o The reliability of the decisions/conclusions is measured
    ▪ Confidence level: the proportion of times that an estimating procedure will be correct in the long run
    ▪ Significance level: the proportion of times that the conclusion will be wrong in the long run
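The "long run" reading of a confidence level can be illustrated with a small simulation. The sketch below is only one possible illustration and rests on assumptions that are not in these notes: a normally distributed population with a known mean and standard deviation, samples of size 40, and the usual 1.96 multiplier for a 95% interval.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Assumed population values (hypothetical, chosen only for the simulation).
MU, SIGMA = 50.0, 10.0
N, TRIALS = 40, 2000
Z = 1.96  # multiplier for a 95% interval when sigma is known

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    mean = statistics.fmean(sample)
    half_width = Z * SIGMA / N ** 0.5
    # The procedure is "correct" when the interval contains the true mean.
    if mean - half_width <= MU <= mean + half_width:
        hits += 1

coverage = hits / TRIALS       # long-run proportion of correct intervals
error_rate = 1 - coverage      # long-run proportion of incorrect intervals
print(f"coverage = {coverage:.3f}, error rate = {error_rate:.3f}")
```

Under these assumptions the coverage should settle close to 0.95 as the number of trials grows, with the remaining share of intervals missing the true mean.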

1.3. THE LANGUAGE OF STATISTICS

• An experiment or investigation is any process of observation or measurement
• Elements are the people or objects about which information is collected
• A population is the complete collection of elements you wish to study
  o If the population contains a countable number of items it is said to be finite
  o If the number of items in the population is unlimited it is said to be infinite
  o A study of the entire population is known as a census
  o A parameter is a numerical measure that describes the population
    ▪ It is usually indicated by a letter from the Greek alphabet (e.g. μ, σ, π)
• A sample is a portion of the population and must be representative of the population
  o A statistic is a numerical measure that describes a sample
    ▪ It is usually indicated by a letter from the Roman alphabet (e.g. x̄, s, p)
• A variable is a characteristic of interest about each element of a population or sample
  o It varies from element to element
  o The observed values of the variable are the data we will use in a statistical investigation
• Variables can be classified as quantitative or qualitative
  o Qualitative or categorical variables provide information that is non-numerical
    ▪ E.g. marital status, type of job, gender, etc.
    ▪ The categories are sometimes coded with numbers, which makes the data appear quantitative, but the codes have no numerical meaning
  o Quantitative variables provide numerical measurements of the elements of the study
    ▪ E.g. height, weight, age, etc.
    ▪ Arithmetic operations can be performed on the values of the variable

• Quantitative variables are further classified as discrete or continuous
  o Discrete variables are countable
    ▪ Values are obtained through counting
    ▪ E.g. the number of students in the class
  o Continuous variables have an infinite number of possible values that are not countable
    ▪ Values are obtained through measuring or weighing
    ▪ E.g. weight, length, time taken to complete a task, age, etc.
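The difference between a parameter and a statistic can be sketched in a few lines of Python. The population below is hypothetical (1 000 randomly generated heights in centimetres), and the sample size of 50 is an arbitrary choice made for this illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical finite population: heights (cm) of 1 000 elements.
population = [random.uniform(150.0, 200.0) for _ in range(1000)]

# Parameter: a numerical measure that describes the whole population (a census).
population_mean = statistics.fmean(population)

# Sample: 50 elements drawn at random, without replacement.
sample = random.sample(population, 50)

# Statistic: the same measure, computed from the sample only.
sample_mean = statistics.fmean(sample)

print(f"parameter (population mean): {population_mean:.2f} cm")
print(f"statistic (sample mean):     {sample_mean:.2f} cm")
```

The two values are usually close but not equal, which is why inferential statistics also reports how reliable a conclusion drawn from the sample is.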

Example 1.2
Distinguish between qualitative and quantitative variables.

1. Gender

2. Temperature

3. Postal code

4. Number of drinks at a party for a couple of friends

Example 1.3
Distinguish between discrete and continuous variables.

1. Number of heads obtained after flipping a coin five times

2. Number of cars that arrive at the KFC drive-through between 10h00 and 12h00

3. Distances different model cars with the same tank capacity can drive in city driving conditions

4. Temperature

1.4. MEASUREMENT
Measurement is the process we use to assign a value to the observations or elements of a variable. There are
four levels or scales of measurement: nominal, ordinal, interval and ratio. The analyses that can be applied
depend on the scale used to measure a variable.

1.4.1. Nominal scale

• This categorical level applies to data that consist of names, labels and categories in no specific order
• Numbers or symbols are used to identify the groups to which various observations belong
• In counting males and females, the male group can be assigned the code 1 and the female group the code 2
  o These numbers serve only as labels for the groups, and the measurement consists of placing the data in the correct group
• No arithmetic operations can be performed on such numbers other than counting the groups and the number of elements falling into each group
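A short Python sketch of the coding described above: the codes 1 and 2 follow the text, while the list of observations is made up for illustration.

```python
from collections import Counter

# Codes from the text: 1 = male, 2 = female. The observations are hypothetical.
labels = {1: "male", 2: "female"}
observations = [1, 2, 2, 1, 2, 2, 1, 2]

# Counting the elements that fall into each group is a permissible operation.
counts = Counter(labels[code] for code in observations)
print(counts)

# The mean of the codes can be computed, but the result has no interpretation:
# the codes are labels, not quantities.
mean_of_codes = sum(observations) / len(observations)
print(mean_of_codes)
```

Swapping the codes (1 for female, 2 for male) would change the mean of the codes but not the counts, which is one way to see that only the counts carry information.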

1.4.2. Ordinal scale

• The categories into which data are grouped are ranked in some order
• Differences between data values are meaningless, e.g. income levels such as low, medium or high
• The permissible analysis methods for ordinal data are those based on the order of the observations
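As an example of an order-based technique, the sketch below sorts ordinal values by rank and picks the middle one (the median). The income levels come from the text; the observations and the choice of the median as the illustration are assumptions.

```python
# Rank order of the categories used in the text.
order = ["low", "medium", "high"]
rank = {level: position for position, level in enumerate(order)}

# Hypothetical observations (an odd number, so there is a single middle value).
incomes = ["medium", "low", "high", "low", "medium", "medium", "high"]

# Sorting uses only the order of the categories, not distances between them.
sorted_incomes = sorted(incomes, key=rank.__getitem__)
median_income = sorted_incomes[len(sorted_incomes) // 2]

print(sorted_incomes)
print(median_income)
```

Nothing here relies on how far "medium" is from "low", which is the information an ordinal scale does not provide.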

1.4.3. Interval scale

• Interval-scaled data are always numerical
• Data can be arranged in order and the distances between data values are meaningful, but ratios of data values are not, because there is no true zero point
• E.g. temperature in degrees Celsius:
  o There is a zero, but it is an arbitrary reference point; it does not represent “nothingness”
  o The increase on the Celsius scale between 10 and 20 is the same as the increase between 30 and 40
  o However, heat cannot be measured in absolute terms (0 ºC does not mean no heat) and it is not possible to say that 40 ºC is twice as hot as 20 ºC
• Arithmetic operations can be performed on the differences between numbers, not on the numbers themselves
• The following are examples of data at the interval level of measurement:
  o IQ scores, scores on the Myers-Briggs scale, belt/shoe sizes, calendar dates, time of day
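The Celsius example can be checked numerically. The sketch below converts the same temperatures to kelvin, a scale that does have a true zero; the conversion and the function name are additions for illustration and are not part of the notes.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert degrees Celsius to kelvin (a scale with a true zero)."""
    return celsius + 273.15

# Differences are meaningful on an interval scale.
diff_low = 20 - 10
diff_high = 40 - 30
print(diff_low == diff_high)

# Ratios are not: the apparent "twice as hot" depends on where zero sits.
ratio_celsius = 40 / 20
ratio_kelvin = celsius_to_kelvin(40) / celsius_to_kelvin(20)
print(ratio_celsius, round(ratio_kelvin, 3))
```

The ratio of the Celsius numbers is 2, yet the same two temperatures expressed in kelvin differ by well under 10%, so the "twice" is an artefact of the arbitrary zero.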

1.4.4. Ratio scale

• Data can be arranged in order
• Both differences between data values and ratios of data values are meaningful
• This scale must contain a zero value that indicates that nothing exists for the variable at the zero point
• Arithmetic operations can be performed on the numeric values themselves
• E.g. money:
  o The zero point is meaningful, i.e. at zero you have none
  o The difference between R10 and R20 is the same as the difference between R50 and R60
  o R10 is twice as much as R5
• Variables such as distance, height, weight and time (duration) use the ratio scale
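The four scales can be summarised as a hierarchy in which each level keeps what the previous level allows and adds one more kind of comparison. The sketch below is one way to write that down; the wording of the operations and the function name `permissible` are choices made for this illustration.

```python
SCALES = ["nominal", "ordinal", "interval", "ratio"]

# What each scale adds, following the descriptions in this section.
ADDS = {
    "nominal": "count elements per category",
    "ordinal": "rank or order the values",
    "interval": "compare differences between values",
    "ratio": "compare ratios of values",
}

def permissible(scale: str) -> list[str]:
    """Return the operations available at `scale`, lowest level first."""
    level = SCALES.index(scale)
    return [ADDS[name] for name in SCALES[: level + 1]]

for scale in SCALES:
    print(f"{scale}: {permissible(scale)}")
```

For example, `permissible("interval")` lists counting, ordering and comparing differences, but not comparing ratios.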

Activity 1.1
Categorise these measurements according to level.
Variable/measurements Measurement scale
1. Species of fish in the Vaaldam
2. Cost of rod and reel
3. Time of return home
4. Rating of fishing area: poor, fair, good
5. Number of fish caught
6. Temperature of water

Activity 1.2
The student council at a university with 10000 students is interested in the proportion of students who favour
a change in the admission requirements at the university. Two hundred students are interviewed to determine
their attitude towards this proposed change. Of the 200, 64 (32%) are in favour of the change. The student
council announced that less than 35% of all students are in favour of a change.

a) What is the question to be answered in this investigation?

b) What is the variable of interest?

c) Classify the variable in terms of type and measurement scale

d) What is the population of interest?

e) What group of students constitutes the sample in this problem?

f) What is the sample statistic?

g) What is the population parameter?
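The arithmetic behind the figures quoted in Activity 1.2 can be reproduced as follows. All three numbers come from the activity text; the variable names, and the sampling fraction as an extra quantity, are additions for illustration.

```python
# Figures quoted in Activity 1.2.
population_size = 10_000
sample_size = 200
in_favour = 64

# Proportion of the sampled students who favour the change.
sample_proportion = in_favour / sample_size

# Share of the population that was actually interviewed.
sampling_fraction = sample_size / population_size

print(f"sample proportion: {sample_proportion:.0%}")
print(f"sampling fraction: {sampling_fraction:.0%}")
```

The first value matches the 32% quoted in the activity. The proportion for all 10 000 students remains unknown unless every student is asked.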

1.5. ROLE OF THE COMPUTER IN STATISTICS

• The Internet can provide access to data across continents at low cost
• Spreadsheets and statistical or mathematical software packages make statistical analysis readily available to everyone
• Use a computer when a task involves:
  o A large volume of input
  o Repetition of projects
  o A need for greater processing speed
  o A need for greater accuracy
  o Processing complexities that require electronic help
• A computer can help you develop your ideas about how to organise the information by using a ‘try and refine’ approach, which can take too long to carry out manually
