PFC - Workshop 08: Statistics

This document discusses statistical analysis techniques for summarizing and understanding data. It describes calculating the mean, standard deviation, and performing linear regression on data sets. It provides examples of programs that would read data from files to calculate these statistics and determine the slope, y-intercept, and correlation coefficient from a linear regression analysis.

Uploaded by

Nhân Trung

PFC - Workshop 08

Statistics

In this workshop, you are to work with data values stored in a file.

Learning Outcome

Upon successful completion of this workshop, you will have demonstrated the ability to read sequential
text files.

Introduction

Statistics helps summarize data in interesting and meaningful ways. The data may consist of values
from a variety of sources. Statistics identifies what is common and the degree of variation about what
is common. In cases where a data set has several aspects, statistics determines the extent to which one
aspect of the set is related to another aspect of the set.

For example, consider a video store with a large selection of rental videos at various prices. We
determine the average price of a video in the store, the degree of price variation about this average and
then infer something about the clientele who rent videos from that store.

If we suspect that the price of each video was determined by the date of its original release, we use
statistics to determine numerically if there is indeed any correlation between the rental price and the
original release date. Moreover, we determine the degree of any such correlation.

Statistics Calculator

Design and code a program that calculates statistical measures for a set of data values stored in a
file. The program prompts for and accepts the name of the data file and reads each record in the file,
calculating the mean and the standard deviation of the data values. The data file contains one
floating-point value per record. Each record is delimited by a newline.
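A minimal sketch of the reading loop in C, assuming the file has already been opened with fopen and that a malformed record should be reported as an error. The names Totals and read_totals are hypothetical and not part of the workshop specification. The loop accumulates the count, the sum and the sum of squares from which the indicators listed below are computed.

```c
#include <stdio.h>

/* Hypothetical accumulator for the running totals of one data file. */
typedef struct {
    long n;         /* number of values read */
    double sum;     /* x_1 + x_2 + ... + x_n */
    double sum_sq;  /* x_1^2 + x_2^2 + ... + x_n^2 */
} Totals;

/* Reads one floating-point value per record until end of file.
   Returns 0 on success, -1 if a record could not be parsed. */
int read_totals(FILE *fp, Totals *t)
{
    double x;

    t->n = 0;
    t->sum = 0.0;
    t->sum_sq = 0.0;

    /* fscanf skips the newline that delimits each record */
    while (fscanf(fp, "%lf", &x) == 1) {
        t->n++;
        t->sum += x;
        t->sum_sq += x * x;
    }

    /* the loop should only stop because the file ran out of data */
    return feof(fp) ? 0 : -1;
}
```

A caller would open the file named by the user with fopen(name, "r"), check the result against NULL, pass the stream to read_totals and close it with fclose afterwards.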

For a sample containing n values, the key statistical indicators include:

1. Mean (m)

m = (x_1 + x_2 + x_3 + ... + x_n) / n

2. Sum of the squares of the values (ss)

ss = x_1^2 + x_2^2 + x_3^2 + ... + x_n^2

3. Variance (d^2)

d^2 = (ss / n) - m^2

4. Standard deviation (d)

d = sqrt((ss / n) - m^2)

The program output might look something like:

Statistics Calculator
=====================
Enter the name of the data file : sample_1.dat

The number of data values read from this file was 39
Their statistical mean is 8.08
Their standard deviation is 2.15

Regression Analysis Calculator

Design and code a program that prompts for and accepts the name of a data file and then reads data
value pairs from the file, while performing a linear regression analysis on the set of data value pairs.

The data file contains two values per record, with the two fields delimited by whitespace and each
record delimited by a newline.

Linear regression analysis models the statistical relationship between two sets of data values.
Typically, we use the method of least squares to determine the line that best fits the scatter of data
points on an x-y graph. The formulas for the line are:

Linear equation:

y = a * x + b

5. Slope (a)

a = [ (x_1 - m_x)(y_1 - m_y) + (x_2 - m_x)(y_2 - m_y) + ... + (x_n - m_x)(y_n - m_y) ] /
    [ (x_1 - m_x)^2 + (x_2 - m_x)^2 + ... + (x_n - m_x)^2 ]

6. y-intercept (b)

b = m_y - a * m_x

where m denotes the mean, which is given by

m_x = (x_1 + x_2 + x_3 + ... + x_n) / n
m_y = (y_1 + y_2 + y_3 + ... + y_n) / n

The statistical measure that indicates how well our best fit line models the relationship between the
variables is called the correlation coefficient.

7. Pearson correlation coefficient (r)

r = { [ x_1 y_1 + x_2 y_2 + ... + x_n y_n ] / n - m_x m_y } / [ d_x d_y ]

where d denotes the standard deviation, which is given by

(d_x)^2 = (s_x / n) - m_x^2
(d_y)^2 = (s_y / n) - m_y^2

where s denotes the sum of the squares, which is given by

s_x = x_1^2 + x_2^2 + x_3^2 + ... + x_n^2
s_y = y_1^2 + y_2^2 + y_3^2 + ... + y_n^2

Your program output might look something like:

Regression Analysis
===================
Enter the name of the data file : sample_2.dat

The slope of the least squares fit is 0.26
The y-intercept of the least squares fit is 26.94
The correlation coefficient is 0.91

What is the age of your car in months ? 32

You can expect a stopping distance of 35.13 metres

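The data in the file "sample_2.dat" represents the minimum stopping distance in metres of cars aged
between 9 months and 76 months travelling at 40 km/h. As the sample output shows, the program also
prompts for a car's age in months and uses the fitted line y = a * x + b to predict its stopping
distance.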
