SMAT3: Statistics and Probability

This document discusses key concepts in statistics and probability. It defines statistics as the field of collecting, analyzing, and interpreting data. There are two main types of statistics: descriptive statistics, which summarize data characteristics, and inferential statistics, which allow testing of hypotheses. It also outlines different types of data, variables, and numbers that are used in statistics, as well as common terminology like population, sample, parameters, and statistics. The document provides examples of how to construct a frequency distribution table and measures of central tendency for grouped and ungrouped data.

Uploaded by

Ms. Arceño
SMAT3: STATISTICS AND PROBABILITY

Statistics – a field of science and mathematics that deals with:

- collecting, analyzing, interpreting, and presenting data;
- drawing inferences from the collected data.
2 TYPES OF STATISTICS

• Descriptive statistics – summarize the characteristics of data.
• Inferential statistics – allow hypotheses to be tested.
2 TYPES OF DATA

• Primary Data – data collected directly from the original source.
• Secondary Data – data taken from existing sources such as magazines, unpublished books, newspapers…
2 TYPES OF VARIABLES

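• Independent Variables – can stand on their own; their values do not depend on another variable.
• Dependent Variables – whatever values are obtained by applying the rule to the independent variable.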
------------

• Qualitative Variables – characteristics (ex. color of the eyes)
• Quantitative Variables – numbers (ex. the number of people in a group)
2 TYPES OF NUMBERS (QUANTITATIVE)

• Discrete – values that can be counted.
• Continuous – values that cannot be counted, because infinitely many values are possible (ex. 0.99999999…).
CONCEPTS IN STATISTICS:
STATISTICS TERMINOLOGY AND DEFINITIONS
OUTLINE:

• What is statistics?
• Overview of terminology and definitions
STATISTICS

• Involves defining a research question and then translating that question into a statistical statement that can be tested using data:
- study design
- data collection
- summarize the data
- analyze data
- generalize to people or population of interest
- communicate the findings to a general audience
2 CATEGORIES OF STATISTICS

• Descriptive Statistics - summarizing a sample of data using plots or numeric summaries such as means, medians, and standard deviations.
• Inferential Statistics - using data or summaries from a sample to infer something about a population, that is, generalizing from the sample to the population. It can be divided into three areas:
▪ Estimation – trying to estimate a quantity such as the average or mean. (What is the average salary of a CEO?)
▪ Hypothesis testing – involves questions like: does a CEO who is six feet or taller earn more on average than one who is not?
▪ Prediction – a question might be: for a particular CEO, what does our model estimate their salary would be?
VOCABULARY

• Unit (Subject) - the entities on which data are collected.
• Variable - a recorded characteristic of the unit or person. A set of data, or a data table, is usually organized with individuals or units in rows and variables in columns.
• Population - the group of interest for our study: who or what we are interested in studying.
• Sample - a subset of the population that is actually studied. The results of any study are only generalizable to the studied population.
• Population Parameter – the thing/quantity we would like to know for the entire population.
• Sample Statistic – the estimate of the parameter obtained from the sample.
▪ Mu (μ) for the population mean and x-bar (x̄) for the sample mean
▪ Sigma (σ) for the population standard deviation and s for the sample standard deviation
▪ Rho (ρ) for the population correlation and r for the sample correlation

Greek letters are used to represent population or true or theoretical values.


Latin letters are used to represent sample estimates.
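
As a rough illustration of the Greek/Latin distinction above, the sketch below computes parameters from a whole population and statistics from a sample of it. The population is made up for the example (1,000 random ages, not data from these notes), and the variable names are only illustrative.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 made-up ages (not data from the notes).
population = [random.randint(18, 60) for _ in range(1000)]

# Population parameters (Greek letters): computed from every unit.
mu = statistics.mean(population)
sigma = statistics.pstdev(population)  # population formula, divides by N

# Sample statistics (Latin letters): computed from a subset only.
sample = random.sample(population, 50)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)  # sample formula, divides by n - 1

print(f"mu = {mu:.2f}, x-bar = {x_bar:.2f}")
print(f"sigma = {sigma:.2f}, s = {s:.2f}")
```

In practice the population is usually not fully observed, so only x-bar and s can be computed, and they serve as estimates of mu and sigma.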
External Validity – asks whether the estimate we get from our sample is generalizable to an external population; in other words, whether the sample represents that external population.
Internal Validity – asks whether our sample estimate (sample statistic) is biased and, in particular, whether there is any confounding.
"Confounding" is an idea that will be defined and explored in much more detail later.

Example (a study of exercise and depression rates among university students):

Sample = 5,000 university students from ABC University
Unit = a university student
Population to generalize back to = university students

(Population) Parameter
• The true (population) difference in depression rates between those who exercise and those who don't.
(Sample) Statistic
• The difference in depression rates between those who exercise and those who don't, within the sample.
External Validity
• Is the result from our study generalizable to the general population of university students?
Internal Validity
• Is the estimate within our study biased or not?


LECTURE #2-STATISTICS 

1. Types of Sampling Techniques

a. Probability Sampling - Samples are chosen in such a way that each member of the
population has a known, though not necessarily equal, chance of being included in the
sample. It is also called unbiased sampling.

1. Simple Random Sampling - All members of the population have an equal chance of
being included in the sample.
2. Systematic Sampling - Every kth member of the population is selected, with the
starting point determined at random.
3. Stratified Sampling - This is used when the population can be subdivided into
several smaller groups or strata; members are then selected at random from each stratum.
4. Cluster Sampling - This is sometimes called area sampling. It is usually used
when the population is very large. In this technique, groups or clusters,
instead of individuals, are randomly selected.
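
The four probability sampling techniques can be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure: the population of 100 numbered units, the two strata "A" and "B", the clusters of 10, and the proportional allocation used for the stratified sample are all assumptions made for the example.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: units numbered 1..100, each tagged with a made-up group.
population = list(range(1, 101))
group_of = {unit: "A" if unit <= 60 else "B" for unit in population}
n = 10  # desired sample size

# 1. Simple random sampling: every unit has an equal chance of selection.
simple = random.sample(population, n)

# 2. Systematic sampling: every kth unit after a randomly chosen start.
k = len(population) // n
start = random.randrange(k)
systematic = population[start::k]

# 3. Stratified sampling: draw at random within each stratum,
#    here in proportion to stratum size (one common choice, assumed).
stratified = []
for label in ("A", "B"):
    stratum = [u for u in population if group_of[u] == label]
    share = round(n * len(stratum) / len(population))
    stratified += random.sample(stratum, share)

# 4. Cluster sampling: randomly pick whole clusters and keep all their units.
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
chosen = random.sample(clusters, 2)
cluster_sample = [u for cluster in chosen for u in cluster]

print(simple, systematic, stratified, cluster_sample, sep="\n")
```

Note that in the cluster sample the random choice is made among groups, so units from the same cluster always enter or leave the sample together.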

b. Non-Probability Sampling - Each member of the population does not
have a known chance of being included in the sample. Instead, personal judgement plays
a very important role in the selection. It is also called biased sampling.

1. Quota Sampling - It is similar to stratified sampling. The only difference is
that in stratified sampling the members of the sample are selected randomly, whereas
in quota sampling they are not.
2. Purposive Sampling - Involves choosing the respondents on the basis of
predetermined criteria set by the researcher.
3. Convenience Sampling or Accidental Sampling - The researcher uses subjects
that are readily available to form the sample.

2. Data Gathering Techniques


a. Direct or interview method
b. Indirect or questionnaire method
c. Registration method
d. Experimental method

3. Types of Questionnaire

a. Structured Question
b. Unstructured Question

4. Data Presentation
Three ways to present data:

a. Textual or Narrative Presentation
*Detailed information is given in textual presentation.
*A narrative report is one way to present information/data.

b. Tabular Presentation
*Numerical values are presented using a table.
*Some information is lost in tabular presentation of data.
*A frequency distribution table is applicable to qualitative variables.

c. Graphical Presentation
*Trends are more easily seen in graphs than in tables.
*Data can be presented well using pictures or figures, such as a pictograph.
*Pie charts are used to present data as parts of one whole.
*Line graphs are for time-series data.
*It is better to present data using graphs than tables, as graphs are much easier to
look at.

5. Constructing a frequency distribution table (FDT) for grouped data. Consider the data
below, which show the ages of 30 randomly selected senior citizen respondents of
Barangay Nueva. n = 30

61 64 74 80 63 73 75 64 65 68
71 63 72 76 69 70 74 68 70 65
64 62 63 68 67 69 68 66 63 64

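Determine:
* Range = HV - LV (highest value minus lowest value) = 80 - 61 = 19
* Class size (i) = 3.47, rounded up to 4
* Class intervals
* Class length
* Class mark (CM/xi) = (lower class limit + upper class limit) / 2
* Class boundary
* Ogive (< and > cumulative frequency)
* Relative frequency
* Cumulative relative frequency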

FREQUENCY DISTRIBUTION TABLE (FDT)

Class intervals   Frequency   Class   Class          < cum       > cum       Relative
(ages)                        Mark    Boundary       frequency   frequency   frequency (%)

77 - 80           1           78.5    76.5 - 80.5    30          1           3.33
73 - 76           5           74.5    72.5 - 76.5    29          6           16.67
69 - 72           6           70.5    68.5 - 72.5    24          12          20
65 - 68           8           66.5    64.5 - 68.5    18          20          26.67
61 - 64           10          62.5    60.5 - 64.5    10          30          33.33

n = 30
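
The table above can be reproduced with a short Python sketch. One assumption is labeled in the code: the notes give the class size as 3.47 without a formula, and dividing the range by the square root of n (19 / √30 ≈ 3.47) happens to reproduce that figure, so that rule is used here. The dictionary keys are illustrative names only.

```python
import math

ages = [61, 64, 74, 80, 63, 73, 75, 64, 65, 68,
        71, 63, 72, 76, 69, 70, 74, 68, 70, 65,
        64, 62, 63, 68, 67, 69, 68, 66, 63, 64]

n = len(ages)
low, high = min(ages), max(ages)
value_range = high - low  # HV - LV = 80 - 61 = 19

# Assumption: class size = range / sqrt(n), rounded up (gives 3.47 -> 4).
class_size = math.ceil(value_range / math.sqrt(n))

# Build the classes from the lowest value upward.
rows = []
lower = low
while lower <= high:
    upper = lower + class_size - 1
    freq = sum(lower <= a <= upper for a in ages)
    rows.append({
        "interval": (lower, upper),
        "freq": freq,
        "mark": (lower + upper) / 2,            # class mark
        "boundary": (lower - 0.5, upper + 0.5),  # class boundary
        "rel_pct": round(100 * freq / n, 2),     # relative frequency (%)
    })
    lower += class_size

# "< cum" accumulates from the lowest class, "> cum" from the highest.
running = 0
for row in rows:
    running += row["freq"]
    row["less_cum"] = running
running = 0
for row in reversed(rows):
    running += row["freq"]
    row["greater_cum"] = running

for row in reversed(rows):  # print highest class first, as in the table
    print(row)
```

The two cumulative columns are the values that would be plotted against the class boundaries to draw the "less than" and "greater than" ogives.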
  
Graphical Presentation: Using Line graph and bar graph.

6. Measures of Central Tendency (Grouped and Ungrouped Data)

a. Mean
b. Mode
c. Median

1. Quartile    2. Decile    3. Percentile

FORMULAS
