LEC1 Introduction To Data Analysis
Course Description
This course is designed for undergraduate engineering and
engineering technology students with emphasis on problem
solving related to societal issues that engineers and
technologists are called upon to solve. It introduces
different methods of data collection and the suitability of
using a particular method for a given situation.
The relationship of probability to statistics is also discussed,
providing students with the tools they need to understand how
"chance" plays a role in statistical analysis. Probability
distributions of random variables and their uses are also
considered, along with a discussion of linear functions of
random variables within the context of their application to data
analysis and inference. The course also covers estimation
techniques for unknown parameters; hypothesis testing
used in making inferences from sample to population; and inference
for regression parameters, including building models for estimating
means and predicting future values of key variables under
study. Finally, statistically based experimental design
techniques and analysis of outcomes of experiments are
discussed with the aid of statistical software.
Lecture 1 – Introduction to Data Analysis
› Origin
– the word "statistics" comes from the Latin word "status", meaning political state
› Definition
– scientific methods for collecting, organizing, summarizing,
presenting and analyzing data as well as deriving valid
conclusions and making reasonable decisions
› Functions
– Condensation
– Comparison
– Forecasting
– Estimation
– Tests of Hypothesis
› Scope
– Commerce
– Agriculture
– Economics
– Planning
– Medicine
– Modern applications
– Education
– And in numerous other fields…
› Limitations
– Statistics is not suited to the study of qualitative phenomena
– Statistics does not study individuals
– Statistical laws are not exact
– Statistical tables may be misused
– Statistics is only one of the methods of studying a problem
› Population
– a complete set of all possible observations of the type which is
to be investigated
– can be finite or infinite
– information on population can be collected in two ways: census
method and sample method.
› Population: Census Method
– every element of the population is included in the investigation
– Merits:
› The data are collected from each and every item of the population
› The results are more accurate and reliable, because every item of the
universe is included
› Intensive study is possible
› The data collected may be used for various surveys, analyses etc.
– Limitations:
› It requires a large number of enumerators, which makes it a costly method
› It requires more money, labor, time, energy, etc.
› It is not possible in some circumstances where the universe is infinite
› Sample
– a portion chosen from the population; a finite subset of the statistical
individuals defined in a population
› sample size - number of units in a sample (e.g. selected voters)
› sampling unit - a constituent of the population that can be selected
individually and cannot be further subdivided for the purpose of
sampling (e.g. voter)
› sampling frame - a list identifying each sampling unit by a number (e.g. list
of voters)
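The three terms above can be illustrated with a minimal sketch. It assumes simple random sampling without replacement, which the slide does not specify, and the voter list and the `draw_sample` helper are hypothetical names made up for the illustration.

```python
import random

# Hypothetical sampling frame: a list identifying each sampling unit
# (a voter) by a number. The names are made up for illustration.
sampling_frame = {number: f"Voter {number}" for number in range(1, 101)}

def draw_sample(frame, sample_size, seed=None):
    """Draw a simple random sample, without replacement, from a sampling frame."""
    rng = random.Random(seed)
    chosen_numbers = rng.sample(sorted(frame), sample_size)
    return {number: frame[number] for number in chosen_numbers}

sample = draw_sample(sampling_frame, sample_size=10, seed=1)
print(len(sample))      # sample size: number of units in the sample
print(sorted(sample))   # frame numbers of the selected voters
```

Fixing the seed only makes the draw repeatable for classroom use; a real survey would not normally fix it.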
› Sample
– Reasons for selecting a sample
› Complete enumerations are practically impossible when the population is
infinite
› When the results are required in a short time
› When the area of survey is wide
› When resources for survey are limited particularly in respect of money and
trained persons
› When the item or unit is destroyed in the course of the investigation
– Parameter and Statistic
› Parameter - a characteristic of a population, denoted by Greek or capital
letters (N, μ, σ)
› Statistic - a characteristic of a sample, denoted by lower case Roman
letters (n, x̄, s)
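The distinction can be made concrete with a short sketch using Python's standard `statistics` module. The population below is simulated, hypothetical data, and the sample is drawn by simple random sampling; both are assumptions made for illustration only.

```python
import random
import statistics

# Hypothetical finite population: 1,000 simulated measurements.
rng = random.Random(42)
population = [rng.gauss(50, 10) for _ in range(1000)]

# Parameters describe the whole population (N, mu, sigma).
N = len(population)
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)   # population formula: divides by N

# Statistics describe a sample drawn from that population (n, x_bar, s).
sample = rng.sample(population, 30)
n = len(sample)
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)            # sample formula: divides by n - 1

print(f"N={N}, mu={mu:.2f}, sigma={sigma:.2f}")
print(f"n={n}, x_bar={x_bar:.2f}, s={s:.2f}")
```

The sample values x̄ and s will generally be close to, but not equal to, μ and σ, and they change from one sample to the next while the parameters stay fixed.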
Broad Classification of Variables
[Diagram not recovered; the only surviving label is "Categorical"]
Levels of Measurement
Level            Property
Nominal Scale    in place of a name, to identify
Ordinal Scale    to indicate order, to rank
Interval Scale   to indicate equal intervals, to add and subtract
Ratio Scale      to indicate ratio, to multiply and divide
The further down the list you go, the more information the measurement carries, because each level keeps the properties of the levels above it.
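As a rough illustration of what each level permits, the sketch below applies one meaningful operation per level. All four data sets are hypothetical, and the choice of mode and median as the nominal and ordinal summaries is a common convention rather than something stated on the slide.

```python
import statistics

# Hypothetical data, one variable per level of measurement.
blood_type = ["A", "O", "B", "O", "AB", "O"]             # nominal
satisfaction = ["low", "high", "medium", "high", "low"]  # ordinal
temperature_c = [20.0, 25.0, 30.0]                       # interval
mass_kg = [2.0, 4.0, 8.0]                                # ratio

# Nominal: values only identify categories, so counting (the mode) is meaningful.
most_common_type = statistics.mode(blood_type)

# Ordinal: values can be ranked, so the middle value (the median) is meaningful.
order = {"low": 0, "medium": 1, "high": 2}
ranked = sorted(satisfaction, key=order.get)
median_satisfaction = ranked[len(ranked) // 2]

# Interval: equal intervals make differences meaningful (add and subtract),
# but 0 degrees Celsius is not "no temperature", so ratios are not.
temperature_rise = temperature_c[2] - temperature_c[1]

# Ratio: a true zero makes ratios meaningful (multiply and divide).
mass_ratio = mass_kg[1] / mass_kg[0]

print(most_common_type)     # O
print(median_satisfaction)  # medium
print(temperature_rise)     # 5.0
print(mass_ratio)           # 2.0
```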