
STA172 - STATISTICAL COMPUTING I

Module 1 & Part of Module 3


NDUKA, UCHENNA C. (Ph.D)
DEPARTMENT OF STATISTICS, UNIVERSITY OF NIGERIA, NSUKKA,

NIGERIA

1.1 Overview of Data Generation

▶ Data Generation - Data generation refers to the
process of creating, collecting, and assembling data
through various methods and techniques. It involves the
systematic gathering of information to build datasets that
can be used for analysis, research, decision-making, and
other purposes. The data generation process encompasses
activities such as data collection, recording, and storage,
and it plays a fundamental role in providing the raw
material for statistical analysis, machine learning, research
studies, and other data-driven applications.

Definition
▶ The goal of data generation is to produce accurate,
reliable, and relevant data that reflects the characteristics
of the phenomenon being studied or observed. This
process involves choosing appropriate data collection
methods, ensuring data quality, handling ethical
considerations, and using tools and technologies to record
and store the data securely.
▶ Data generation is a critical step in the broader data
lifecycle, and the quality of generated data significantly
influences the validity and reliability of subsequent
analyses and conclusions. Researchers, scientists, and
practitioners in various fields rely on well-executed data
generation processes to obtain insights, make informed
decisions, and contribute to the advancement of
knowledge in their respective domains.
Importance of DG
▶ Importance of generating data for statistical
analysis
1. Basis for Analysis: Data serves as the foundation
for statistical analyses. Without appropriate and relevant
data, statistical techniques and methods have no input to
process.
2. Informed Decision-Making: Reliable data provides
the basis for making informed decisions. Statistical
analyses help extract patterns, trends, and relationships
from the data, assisting decision-makers in understanding
the implications of various choices.
3. Research and Exploration: Researchers use data
generation to explore hypotheses, test theories, and
contribute to the body of knowledge in their respective
fields. New data helps advance understanding and may
lead to the development of new models or insights.
Importance of DG
▶ 4. Quality Assurance: Data generation is a critical
aspect of ensuring data quality. Properly collected and
documented data contributes to the reliability and validity
of statistical analyses, reducing the likelihood of biased or
inaccurate results.
5. Predictive Modeling: Statistical analyses enable the
development of predictive models. By identifying patterns
and relationships within existing data, these models can
be applied to make predictions or forecasts in new
situations.
6. Performance Evaluation: In various fields, data
generation and subsequent statistical analysis are used to
evaluate the performance of systems, processes, or
interventions. This evaluation is essential for making
improvements and optimizing outcomes.
1.2 Table of Random Numbers

▶ Random numbers: Random numbers are a sequence of
numbers that lack any pattern, predictability, or order.
True randomness is often associated with natural
phenomena, such as atmospheric noise or radioactive
decay, but in computer science and statistics, random
numbers are typically generated using algorithms. These
algorithms, known as random number generators (RNGs),
produce sequences of numbers that mimic randomness,
although they are ultimately deterministic.
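
▶ To illustrate that determinism, here is a minimal sketch of a
linear congruential generator (LCG), one of the oldest RNG
designs, in Python; the constants are the widely quoted
Numerical Recipes values and are chosen purely for illustration:

class LCG:
    """Minimal linear congruential generator:
    x_{n+1} = (a * x_n + c) mod m."""
    def __init__(self, seed=42):
        self.state = seed
        self.m = 2**32         # modulus
        self.a = 1664525       # multiplier (Numerical Recipes constants)
        self.c = 1013904223    # increment

    def next_uniform(self):
        # Each output is fully determined by the previous state,
        # yet the sequence of outputs looks random.
        self.state = (self.a * self.state + self.c) % self.m
        return self.state / self.m

gen = LCG(seed=42)
print([round(gen.next_uniform(), 5) for _ in range(5)])
# Creating LCG(seed=42) again reproduces exactly the same
# five numbers - deterministic, hence "pseudo" random.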

▶ A table of random numbers is a structured
arrangement of numbers devoid of any discernible pattern
or order.
▶ These tables are frequently employed in statistical
sampling, simulations, and other applications where a
source of unpredictability or randomness is necessary.
▶ Random numbers tables are particularly useful in
designing experiments, conducting surveys, and
implementing simulations that require a random and
unbiased selection process.
▶ The primary purpose of a table of random numbers is to
offer a systematic and impartial way of selecting values
for experimentation or analysis.

Example of RNT

Table: Table of Random Numbers

0.28999  0.07640  0.05954
0.67818  0.51647  0.82783
0.88404  0.56848  0.67874
0.51184  0.61581  0.43623
0.52828  0.91765  0.36030

▶ In this example, each cell of the table contains a random
number between 0 and 1. The numbers are generated
using a pseudorandom number generator.
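
▶ A table like the one above can be produced with any
pseudorandom generator; a minimal sketch using Python's
standard random module (the seed is fixed only so the table
is reproducible):

import random

random.seed(1)        # fix the seed for a reproducible table
rows, cols = 5, 3
for _ in range(rows):
    # one row of random numbers in [0, 1), to 5 decimal places
    print("  ".join(f"{random.random():.5f}" for _ in range(cols)))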

Methods of using table of random numbers for
data generation
▶ Random Sampling: Use the table to randomly select
elements from a population. Assign each element a
unique identifier, and use the random numbers to pick
samples without bias. This is useful in survey sampling or
experimental design.
▶ Random Assignment: For experimental studies, use
the table to randomly assign subjects to different
experimental conditions. This ensures that each
participant has an equal chance of being assigned to any
specific group. Both uses are sketched in the code below.
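
▶ A minimal sketch of both methods in Python, where
random.sample plays the role of reading values off the table;
the population size, sample size, and group labels here are
illustrative assumptions:

import random

# Random sampling: each of the 80 identifiers is equally likely
# to be drawn, and no identifier is drawn twice.
population = list(range(1, 81))
sample_ids = random.sample(population, k=32)
print("sampled identifiers:", sorted(sample_ids))

# Random assignment: shuffle the subjects, then split them into
# groups, so every subject has the same chance of either group.
subjects = [f"S{i}" for i in range(1, 21)]   # 20 hypothetical subjects
random.shuffle(subjects)
treatment, control = subjects[:10], subjects[10:]
print("treatment:", treatment)
print("control:  ", control)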

1.3 Practical Exercise
▶ Consider the following set of data on sales in millions of
Naira.

Table: (a) Data on sales (in millions)

10 11 9 10 12 7 10 9
8 10 9 9 12 8 11 10
10 9 8 9 12 10 8 10
10 10 11 11 10 11 11 8
9 10 8 9 10 10 9 11
9 10 10 11 8 10 12 11
11 8 9 8 11 9 9 9
11 11 11 10 12 9 10 11
9 9 12 8 10 10 11 9
9 12 10 9 9 9 9 10
Table: (b) Data on sales (in millions)

10 9 11 9 9 9 11 9
10 7 11 11 10 10 11 8
10 9 9 10 12 11 10 10
8 9 8 14 11 12 10 11
12 8 14 10 9 10 10 9
8 10 10 8 8 9 12 12
11 11 8 10 12 9 9 11
9 10 10 8 10 11 10 10
12 9 10 10 10 10 11 8
11 11 10 11 10 8 9 11
9 10 12 11 10 12 11 11
12 11 10 11 8 10 11 12

▶ Using Table (a) do the following:
1. give each entry in the table a number (1 - 80);
2. set your sample size n = 32;
3. use the Ran# function on your calculator to select 32
observations from the sales data in (a) without repetition;
4. use the necessary functions on your calculator to obtain
the mean and standard deviation of the selected
observations;
5. repeat (3) and (4) 10 times to obtain 10 sets of
averages and 10 sets of standard deviations;
6. use the necessary functions on your calculator to obtain
the mean and standard deviation of all the observations in
(a);
7. compare this mean and standard deviation with the
ones obtained from your samples. (A Python sketch of the
same procedure follows this list.)
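
▶ For checking your calculator work, here is a minimal Python
sketch of the same procedure; sample_mean_sd is a hypothetical
helper name, and the 80 values are those of Table (a):

import random
import statistics

sales_a = [10, 11, 9, 10, 12, 7, 10, 9,
           8, 10, 9, 9, 12, 8, 11, 10,
           10, 9, 8, 9, 12, 10, 8, 10,
           10, 10, 11, 11, 10, 11, 11, 8,
           9, 10, 8, 9, 10, 10, 9, 11,
           9, 10, 10, 11, 8, 10, 12, 11,
           11, 8, 9, 8, 11, 9, 9, 9,
           11, 11, 11, 10, 12, 9, 10, 11,
           9, 9, 12, 8, 10, 10, 11, 9,
           9, 12, 10, 9, 9, 9, 9, 10]

def sample_mean_sd(data, n, reps=10):
    # Steps 3-5: draw n observations without repetition, record the
    # sample mean and standard deviation, and repeat reps times.
    results = []
    for _ in range(reps):
        sample = random.sample(data, k=n)
        results.append((statistics.mean(sample), statistics.stdev(sample)))
    return results

for mean, sd in sample_mean_sd(sales_a, n=32):
    print(f"mean = {mean:.3f}, sd = {sd:.3f}")

# Steps 6-7: the mean and standard deviation of all 80
# observations, for comparison with the sample results above.
print(statistics.mean(sales_a), statistics.stdev(sales_a))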

▶ Using Table (b) do the following:
1. give each entry in the table a number (1 - 96);
2. set your sample size n = 50;
3. use the RanInt function on your calculator to select 50
observations from the sales data in (b) without repetition;
4. use the necessary functions on your calculator to obtain
the mean and standard deviation of the selected
observations;
5. repeat (3) and (4) 10 times to obtain 10 sets of
averages and 10 sets of standard deviations;
6. use the necessary functions on your calculator to obtain
the mean and standard deviation of all the observations in
(b);
7. compare this mean and standard deviation with the
ones obtained from your samples.
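
▶ The earlier sketch applies here unchanged: load the 96 values
of Table (b) into a list (say, a hypothetical sales_b) and call
sample_mean_sd(sales_b, n=50); random.sample then plays the
role of the calculator's RanInt function.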

3.1 Introduction to Statistical Calculations
Statistical calculations form the backbone of quantitative
analysis, providing essential tools for interpreting and making
sense of data. In various fields such as economics, finance,
biology, and social sciences, statistical methods are employed
to uncover patterns, trends, and relationships within data sets.
This introduction aims to shed light on the fundamental
concepts and purposes that underpin statistical calculations.
Purpose of Statistical Computations - Statistical
computations serve several crucial purposes in the realm of
data analysis:
▶ Summarization: Statistical calculations help summarize
large and complex data sets into meaningful measures,
providing a concise overview of essential characteristics.

▶ Description: They facilitate the description of data by
offering insights into its central tendencies, variability, and
distribution.
▶ Inference: Statistical computations enable the drawing
of inferences and conclusions about populations based on
sample data, supporting decision-making processes.
▶ Prediction: Through regression analysis and other
predictive models, statistical calculations empower
analysts to make informed predictions about future trends
or outcomes.
Role of Calculator in Statistical Analysis - The use of
calculators in statistical analysis has become indispensable due
to the following reasons:
▶ Efficiency: Calculators streamline complex mathematical
operations, making statistical calculations more efficient
and less prone to human error.
▶ Accessibility: Modern calculators come equipped with
built-in statistical functions, providing easy access to
measures of central tendency, dispersion, and other
statistical parameters.
▶ Complexity Handling: In scenarios involving large data
sets or intricate computations, calculators facilitate the
handling of complex statistical procedures with speed and
accuracy.
▶ Real-world Application: The integration of statistical
functions into calculators allows for immediate application
in various real-world situations, from business and finance
to scientific research.

3.2 Measures of Central Tendency and Dispersion
In statistical analysis, measures of central tendency and
dispersion provide valuable insights into the characteristics and
distribution of a data set. They help summarize the data and
understand its variability, aiding in making informed decisions
and drawing meaningful conclusions. Let's delve into each
concept:
Measures of Central Tendency - Measures of central
tendency represent the central or typical value around which
data points tend to cluster. They are primarily:
▶ Mean (Average): The mean is the sum of all data values
divided by the total number of observations. It is highly
sensitive to outliers and extreme values.
▶ Median: The median is the middle value of a data set
when arranged in ascending or descending order. It is less
affected by outliers and is often preferred for skewed
distributions.
▶ Mode: The mode is the value that appears most
frequently in a data set. A data set can have one mode
(unimodal), multiple modes (multimodal), or no mode.
Measures of Dispersion - Measures of dispersion quantify
the spread or variability of data points around the central
tendency. They provide insights into the consistency and
variability within the data set. The main measures of
dispersion include:
▶ Range: The range is the difference between the maximum
and minimum values in a data set. It is simple to
calculate but sensitive to outliers.
▶ Variance: The variance measures the average squared
deviation of each data point from the mean. It provides a
more comprehensive understanding of data dispersion but
is not in the original units of the data.
▶ Standard Deviation: The standard deviation is the square
root of the variance. It represents the average distance of
data points from the mean and is widely used due to its
intuitive interpretation.
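▶ For reference: if x_1, ..., x_n are the observations with mean
x̄ = (x_1 + ... + x_n)/n, the sample variance is
s^2 = Σ(x_i − x̄)^2 / (n − 1) and the standard deviation is
s = √(s^2). Many calculators also report the population
versions, which divide by n rather than n − 1.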
Step-by-step calculator computations for mean,
median, mode, range, variance, and standard
deviation
▶ Hands-on using the following data set: 45, 43, 42, 80, 84,
82, 56, 59, 52, 71, 72, 76, 67, 62, 65, 23, 26, 27, 34, 35,
37, 48, 49, 50, 53
▶ Using your scientific calculator, obtain the mean, median,
mode, standard deviation, variance, and range.
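
▶ The same quantities can be cross-checked with Python's
standard statistics module; a minimal sketch using the data
set above:

import statistics

data = [45, 43, 42, 80, 84, 82, 56, 59, 52, 71, 72, 76, 67,
        62, 65, 23, 26, 27, 34, 35, 37, 48, 49, 50, 53]

print("mean     :", statistics.mean(data))
print("median   :", statistics.median(data))
print("mode(s)  :", statistics.multimode(data))
# every value above occurs exactly once, so there is no single mode
print("range    :", max(data) - min(data))
print("variance :", statistics.variance(data))   # sample variance (n - 1)
print("std dev  :", statistics.stdev(data))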

3.3 Time Series Analysis
Time series data is a type of sequential data where
observations are recorded at regular intervals over time. It is a
fundamental component of many fields, including economics,
finance, weather forecasting, and social sciences. Time series
analysis involves studying the patterns, trends, and behaviors
exhibited by the data over time. Here's an overview of key
aspects of time series data:
▶ Temporal Structure: Time series data is organized
chronologically, with observations recorded at successive
time points. The time intervals between observations are
usually regular (e.g., daily, monthly, yearly).
▶ Components of Time Series: Trend, Seasonal
variations, Cyclical variations, and Irregular variations.

Time Series Analysis Techniques:
▶ Descriptive Analysis: Examining the basic
characteristics of the time series data, such as mean,
median, variance, and standard deviation.
▶ Trend Analysis: Identifying and modeling the underlying
trend in the data to understand long-term behavior.
Linear trend: y_t = a_0 + a_1 t
Quadratic trend: y_t = a_0 + a_1 t + a_2 t^2
Logarithmic trend: y_t = a_0 + a_1 ln(t)
Exponential trend: y_t = a_0 e^(b_1 t)

▶ Hands-on: fit each trend to the following series.

t    1    2    3    4    5    6    7    8    9    10   11   12
y_t  450  501  523  550  570  601  624  700  758  805  809  801
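
▶ A minimal sketch of fitting all four trends by least squares,
using the numpy library (an assumption; any least-squares tool
would do). The logarithmic and exponential fits use the standard
linearising transforms ln(t) and ln(y), so the exponential
coefficients come from regressing ln(y) on t rather than from
nonlinear least squares:

import numpy as np

t = np.arange(1, 13)
y = np.array([450, 501, 523, 550, 570, 601,
              624, 700, 758, 805, 809, 801])

# Linear trend: y_t = a_0 + a_1 t
a1, a0 = np.polyfit(t, y, 1)
print(f"linear      : y = {a0:.2f} + {a1:.2f} t")

# Quadratic trend: y_t = a_0 + a_1 t + a_2 t^2
q2, q1, q0 = np.polyfit(t, y, 2)
print(f"quadratic   : y = {q0:.2f} + {q1:.2f} t + {q2:.2f} t^2")

# Logarithmic trend: y_t = a_0 + a_1 ln(t)
l1, l0 = np.polyfit(np.log(t), y, 1)
print(f"logarithmic : y = {l0:.2f} + {l1:.2f} ln(t)")

# Exponential trend: y_t = a_0 e^(b_1 t), via ln(y) = ln(a_0) + b_1 t
b1, ln_a0 = np.polyfit(t, np.log(y), 1)
print(f"exponential : y = {np.exp(ln_a0):.2f} e^({b1:.4f} t)")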

Practical Exercise
A) Consider the following data on monthly Australian beer
production: 164, 148, 152, 144, 155, 125, 153, 146, 138, 190,
192, 192, 147, 133, 163, 150, 129, 131, 145, 137, 138, 168,
176, 188, 139, 143, 150, 154, 137, 129, 128, 140, 143, 151,
177, 184, 151, 134, 164, 126, 131, 125, 127, 143, 143, 160,
190, 182, 138, 136, 152, 127, 151, 130, 119, 153
Obtain the linear, quadratic, logarithmic, and exponential
trend equations.
B) The following data are on sales of shampoo over a three-year
period: 266, 145.9, 183.1, 119.3, 180.3, 168.5, 231.8, 224.5,
192.8, 122.9, 336.5, 185.9, 194.3, 149.5, 210.1, 273.3, 191.4,
287, 226, 303.6, 289.9, 421.6, 264.5, 342.3, 339.7, 440.4,
315.9, 439.3, 401.3, 437.4, 575.5, 407.6, 682, 475.3, 581.3,
646.9
Obtain the linear, quadratic, logarithmic, and exponential
trend equations.
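▶ Both parts can be worked with the numpy sketch above: for
part A set t = np.arange(1, 57) and put the 56 beer-production
figures in y; for part B set t = np.arange(1, 37) and use the
36 shampoo figures, then rerun the four fits.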
▶ Embrace the journey, discover your potential!
