Lect. One

This document provides an overview of engineering statistics and probability. It discusses three main branches: descriptive statistics, inferential statistics, and probability. Descriptive statistics involves organizing and summarizing data, while inferential statistics draws conclusions from samples to populations. There are also four levels of data measurement and five types of sampling discussed. The document concludes with an overview of the typical statistical process, which involves collecting data, presenting it, analyzing it mathematically, and interpreting results.

Uploaded by

hkaqlq

Basrah University for Oil and Gas // College of Oil and Gas Engineering // Department of Oil and Gas Engineering

Miss. Rasha
2018-2019

Subject: Engineering Statistics // lecture one


Introduction

Statistics is the science of data: a collection of methods for planning experiments,
obtaining data, and then organizing, summarizing, presenting, analyzing, and interpreting
the data and drawing conclusions from them.
This science allows us to quantify uncertainty in order to help us make
meaningful predictions and decisions.
Statistics and probability are used everywhere, from basic arithmetic summaries
to the most applied areas of mathematics; for example, in economics,
business, the natural and social sciences, and astronomy.
There are three branches of statistics and probability:
➢ Descriptive statistics.
➢ Inferential statistics.
➢ Probability.
Descriptive statistics:
Used to make sense of data through the collection, organization, summarization, and
presentation of those data. The aim is to extract the important information from the
data set and set aside unneeded detail, so as to avoid information overload, reach
conclusions, and make decisions.
Inferential statistics:
Used to draw conclusions about a population from information about a
representative sample, that is, to generalize from samples to populations.
It includes performing hypothesis tests, determining relationships between variables,
and making predictions.

Types of Data:
1. Qualitative Variables (Data): variables (data) which assume non-numerical values.
2. Quantitative Variables (Data): variables (data) which assume numerical values.
There are two types of quantitative data:
a) Discrete Variables (Data): usually obtained by counting. There is a finite or
countable number of possible values with discrete data. For example, you can't have
2.73 people in the room.
b) Continuous Variables (Data): assume an infinite number of possible values and are
usually obtained by measurement. Length, weight, and time are all examples of
continuous variables. Since continuous variables are real numbers, we usually round
them. This implies a boundary that depends on the number of decimal places. For
example, with no decimal places, 4 is really any value 3.5 <= x < 4.5. Likewise, with two
decimal places, 3.03 is really any value 3.025 <= x < 3.035.
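
The rounding boundaries described above can be sketched in Python. This is only an illustrative helper: the function name `implied_interval` is an assumption, not part of the lecture, and the value is passed as text so that the number of decimal places is preserved.

```python
from decimal import Decimal

def implied_interval(rounded):
    """Return (low, high) such that any x with low <= x < high rounds to `rounded`.

    `rounded` is a string such as "4" or "3.03" (hypothetical helper).
    """
    value = Decimal(rounded)
    # The exponent encodes the number of decimal places: "3.03" -> -2, "4" -> 0.
    unit = Decimal(1).scaleb(value.as_tuple().exponent)
    half_unit = unit / 2
    return value - half_unit, value + half_unit

for text in ("4", "3.03"):
    low, high = implied_interval(text)
    print(f"{text} stands for any x with {low} <= x < {high}")
```

The half-unit is half of the last reported decimal place, which is why more decimal places give a narrower interval.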

Levels of Data:
There are four levels of measurement: Nominal, Ordinal, Interval, and Ratio. These
go from lowest level to highest level. Data is classified according to the highest
level which it fits. Each additional level adds something the previous level did not
have.
1. Nominal Level (Categorical Data):
Level of measurement which classifies data into mutually exclusive, all-inclusive
categories in which no order or ranking can be imposed on the data. Nominal is the
lowest level; only names are meaningful here. Examples: color, manufacturer.

2. Ordinal Level:
Level of measurement which classifies data into categories that can be ranked.
Differences between the ranks are not meaningful. Ordinal adds an order to the data,
for example, sizes (small, medium, large).
3. Interval Level:
Level of measurement which classifies data that can be ranked and for which differences
are meaningful. However, there is no true zero (natural starting point), so ratios are
meaningless. Typical examples are calendar dates and temperatures in degrees Celsius.
4. Ratio Level:
Level of measurement which classifies data that can be ranked, differences are
meaningful, and there is a true zero. True ratios exist between the different units of
measure, for example counts, weight, height, etc.
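
A short numerical sketch may help show why ratios are meaningless at the interval level. The two temperatures below are illustrative values chosen for this example, not data from the lecture. Celsius has an arbitrary zero (interval level), while Kelvin has a true zero (ratio level).

```python
def celsius_to_kelvin(t_celsius):
    # Kelvin is the Celsius value shifted so that zero is a true zero.
    return t_celsius + 273.15

warm_c, cool_c = 20.0, 10.0  # illustrative temperatures in degrees Celsius

# Differences agree on both scales, so differences are meaningful.
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios disagree: 20 C is not "twice as hot" as 10 C.
ratio_c = warm_c / cool_c
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(f"difference: {diff_c:.2f} (Celsius) vs {diff_k:.2f} (Kelvin)")
print(f"ratio:      {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (Kelvin)")
```

Only the ratio computed on the scale with a true zero has a physical interpretation.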

Definitions
Population
All subjects possessing a common characteristic that is being studied. For example, all
engineering students.
Census
The collection of data from every element in a population. For example, record the
height for each student in the engineering college.
Sample
A subgroup or subset of the population that is measured. For example, the set of
students in a class in the college.
Parameter
Characteristic or measure obtained from a population.
Statistic (not to be confused with Statistics)
Characteristic or measure obtained from a sample. For example,
the mean (average) height of the students in a class in the college.
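
The difference between a parameter and a statistic can be sketched as follows. The heights are randomly generated, hypothetical values (an assumed population of 500 students with heights around 170 cm), used only to illustrate the definitions above.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights in cm of all 500 students (made-up data).
population = [round(rng.gauss(170, 8), 1) for _ in range(500)]

# Parameter: computed from every element of the population (a census).
parameter = statistics.mean(population)

# Statistic: computed from a sample, here one "class" of 30 students.
sample = rng.sample(population, 30)
statistic = statistics.mean(sample)

print(f"population mean (parameter): {parameter:.2f} cm")
print(f"sample mean (statistic):     {statistic:.2f} cm")
```

The statistic is usually close to the parameter but not equal to it, and it changes from sample to sample.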

Types of Sampling

There are five types of sampling: Random, Systematic, Convenience, Cluster, and
Stratified.

1) Random sampling is analogous to putting everyone's name into a hat and drawing
out several names. Each element in the population has an equal chance of being selected.
While this is the preferred way of sampling, it is often difficult to do, because it
requires a complete list of every element in the population. Computer-generated lists
are often used with random sampling.

2) Systematic sampling is easier to do than random sampling. In systematic sampling,
the list of elements is "counted off", that is, every k-th element is selected.

3) Convenience sampling is very easy to do, but it's probably the worst technique to
use. In convenience sampling, readily available data are used, for example the first
people the surveyor runs into.

4) Cluster sampling is accomplished by dividing the population into groups, usually
geographically. These groups are called clusters or blocks. The clusters are randomly
selected, and every element in the selected clusters is used.

5) Stratified sampling also divides the population into groups called strata. However,
this time it is by some characteristic, not geographically. For instance, the population
might be separated into males and females. A sample is taken from each of these
strata using either random, systematic, or convenience sampling.
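
The five sampling types above can be sketched in Python. Everything here is a hypothetical illustration: the population is a list of 100 student ID numbers, the clusters are five assumed buildings of 20 students each, and the split into strata is arbitrary.

```python
import random

def simple_random_sample(population, n, rng):
    # Every element has the same chance of being selected.
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    # "Count off" the list: random starting point, then every k-th element.
    start = rng.randrange(k)
    return population[start::k]

def convenience_sample(population, n):
    # Whatever is readily available, here simply the first n elements.
    return population[:n]

def cluster_sample(clusters, n_clusters, rng):
    # Randomly select whole clusters and keep every element in them.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [element for name in chosen for element in clusters[name]]

def stratified_sample(strata, n_per_stratum, rng):
    # Draw a random sample from each stratum separately.
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

rng = random.Random(42)
population = list(range(1, 101))  # hypothetical student IDs 1..100
clusters = {f"building_{i + 1}": population[i * 20:(i + 1) * 20]
            for i in range(5)}
strata = {"male": population[:60], "female": population[60:]}  # assumed split

random_ids = simple_random_sample(population, 10, rng)
systematic_ids = systematic_sample(population, 10, rng)
convenience_ids = convenience_sample(population, 10)
cluster_ids = cluster_sample(clusters, 2, rng)
stratified_ids = stratified_sample(strata, 5, rng)

print("random:     ", sorted(random_ids))
print("systematic: ", systematic_ids)
print("convenience:", convenience_ids)
print("cluster:    ", len(cluster_ids), "students from 2 buildings")
print("stratified: ", stratified_ids)
```

In this sketch the stratified sample draws randomly within each stratum; the text also allows systematic or convenience sampling at that step.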

Statistical Processes:

Any statistical process follows these steps:

1. Collection of data

2. Presentation of data

3. Mathematical study of the data by applying statistical methods

4. Interpretation of the results and drawing of conclusions

1- Collection of data: The process of data collection is directly related to sampling
and is best viewed as complementary to it. Data are therefore collected directly
from the identified and selected sample. Direct data include recordable
spoken or written words as well as observable body language, actions, and
interactions. Indirect data are generated, in the first instance, by someone or
something else, such as documents or photographs reporting an event, or an
artistic rendition of an event or experience (e.g. novels, songs, paintings, poems,
photographs).

2- Presentation of data

The collected data should be presented in an intelligible form. For a large
number of observations, a frequency table is usually created, with the first column
listing each distinct value and the second column giving its frequency. The frequency
is the number of times each value occurs.

Example: Data set: 3, 7, 4, 0, 2, 9, 7, 5, 6, 5, 8, 7, 4, 3, 4, 5, 0, 1, 1, 3, 4, 7, 6, 8, and 7.
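
As a minimal sketch, the frequency table for this data set can be built in Python with `collections.Counter` from the standard library. The column headings are an assumption about how the table would be laid out.

```python
from collections import Counter

# The 25 observations from the example above.
data = [3, 7, 4, 0, 2, 9, 7, 5, 6, 5, 8, 7, 4,
        3, 4, 5, 0, 1, 1, 3, 4, 7, 6, 8, 7]

# Count how many times each distinct value occurs.
frequency = Counter(data)

print("Value  Frequency")
for value in sorted(frequency):
    print(f"{value:>5}  {frequency[value]:>9}")

print("Total observations:", sum(frequency.values()))
```

The frequencies must add up to the number of observations, which is a quick check that no value was missed.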

If the data set is large, the values are grouped into classes, usually of
equal width. This is done by obtaining:
➢ the range: r = Max value – Min value of the data;
➢ the number of classes: m = 1 + 3.3 log10(N) (Sturges' method), or
m = √N (square-root method), where N is the number of observations.

Example: Represent the following 90 observations in a frequency table with a
suitable class interval.

Solution:
The smallest number is: 12

The biggest number is: 53

The range (r) = 53 - 12 = 41

Number of classes m = 1 + 3.3 * log10(90) = 7.45 ≈ 7

or m = square root of 90 = 9.49 ≈ 9

We choose m = 7 for this example

The width of each class is equal to r/m = 41/7 = 5.86 ≈ 6
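
The arithmetic of this solution can be checked with a few lines of Python. The variable names are illustrative. Rounding the class width up with `math.ceil` is an assumption made here so that the m classes are guaranteed to cover the whole range; in this example it gives the same width of 6 as rounding to the nearest integer.

```python
import math

n_obs = 90                    # number of observations
smallest, largest = 12, 53    # extreme values quoted in the solution

data_range = largest - smallest            # r = 53 - 12

m_sturges = 1 + 3.3 * math.log10(n_obs)    # Sturges' method
m_sqrt = math.sqrt(n_obs)                  # square-root method

m = round(m_sturges)                       # number of classes chosen
width = math.ceil(data_range / m)          # class width, rounded up

print(f"range r          = {data_range}")
print(f"Sturges' method  = {m_sturges:.2f} -> {round(m_sturges)} classes")
print(f"square-root rule = {m_sqrt:.2f} -> {round(m_sqrt)} classes")
print(f"class width      = {data_range / m:.2f} -> {width}")
```

With m = 7 and a width of 6, the classes span 42 units, which is enough to cover the range of 41.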

Class Frequency

Statistical data representation

Statistical data: a sequence of observations made on a set of objects included in
the sample drawn from the population.

Raw data: data collected in their original form, that is, data that have not yet been
organized numerically.

Example 1: The final grades in mathematics of 30 students at ET College are
recorded in the accompanying table.

68 84 75 82 68 90 62 88 76 93

73 79 88 73 60 93 71 59 85 57

61 65 75 87 74 62 95 78 63 72

Using the data in the table above, find:

(a) The highest grade.

(b) The lowest grade.

(c) The range.

(d) The grades of the three lowest-ranking students.

(e) The grade of the student ranking tenth highest.

(f) The number of students who received grades of 85 or higher.

(g) The percentage of students who received grades higher than 65 but not higher
than 85.

Solution:
*The first step is to subdivide the data into appropriate classes, as in table 2.1, to
make the questions easier to answer.

*The second step is to construct an array by arranging the numbers of each class
in order, as in table 2.2.

*From table 2.2, it is relatively easy to answer the above questions:
(a) The highest grade is (95).
(b) The lowest grade is (57).
(c) The range is 95 – 57 = 38 [Range = highest value - lowest value]
(d) The three lowest-ranking students have grades 57, 59, and 60.
(e) The grade of the student ranking tenth highest is 82.
(f) The number of students receiving grades of 85 or higher is 8.
(g) The percentage of students who received grades higher than 65 but not higher
than 85 is 15/30 = 50%.
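
The same answers can be reproduced by sorting the raw grades, which plays the role of the array described above. This is a sketch for checking the results, with illustrative variable names; it does not reconstruct tables 2.1 and 2.2.

```python
# The 30 final grades from Example 1.
grades = [68, 84, 75, 82, 68, 90, 62, 88, 76, 93,
          73, 79, 88, 73, 60, 93, 71, 59, 85, 57,
          61, 65, 75, 87, 74, 62, 95, 78, 63, 72]

ordered = sorted(grades)  # ascending array of the raw data

highest = ordered[-1]                                   # (a)
lowest = ordered[0]                                     # (b)
grade_range = highest - lowest                          # (c)
three_lowest = ordered[:3]                              # (d)
tenth_highest = ordered[-10]                            # (e)
n_85_or_higher = sum(1 for g in grades if g >= 85)      # (f)
n_between = sum(1 for g in grades if 65 < g <= 85)      # (g) count
pct_between = 100 * n_between / len(grades)             # (g) percentage

print("(a) highest grade:", highest)
print("(b) lowest grade:", lowest)
print("(c) range:", grade_range)
print("(d) three lowest grades:", three_lowest)
print("(e) tenth highest grade:", tenth_highest)
print("(f) grades of 85 or higher:", n_85_or_higher)
print(f"(g) higher than 65 but not higher than 85: {pct_between:.0f}%")
```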
