Data Analysis Training Workshop - Day 2 Presentation

The document outlines a virtual Data Analysis Training Workshop scheduled for June 24-26, 2024, presented by Dr. Reesha Kara, covering topics such as MS Excel, descriptive statistics, and hypothesis testing. It includes detailed sections on variable construction, coding and labeling, measures of central tendency, standard deviation, and various statistical tests. The workshop aims to equip participants with essential data analysis skills and understanding of hypothesis testing methodologies.


Institute of Social and Economic Research, Rhodes University

Data Analysis Training Workshop

24-26 June 2024
Day 2
Virtual workshop

Presented by Dr Reesha Kara

Structure of the day
Section A: Introduction to MS Excel
1. Variable Construction
2. Coding and Labelling
Section B: Descriptive Statistics
1. Measures of Central Tendency
2. Standard Deviation
3. PivotTables
Section C: Introduction to Hypothesis Testing
1. What is a Hypothesis and Hypothesis Testing?
2. Definitions
3. Tests of Significance
4. Statistical Tests used in Hypothesis Testing
5. Understanding the p-value
6. One-Sample T-Test
7. Two-Sample T-Test
8. Chi-Square Test of Independence
Section A: Introduction to MS Excel
1. Variable Construction
2. Coding and Labelling of Variables
1. Variable Construction
1. Variable construction is important when you need to create derived
variables from your existing data to answer a research question

2. Using dataset 1 in the ‘example1.xlsx’ data file:

a. Create a variable that represents the average score of test 1 and test 2 for each
student.
b. Ensure that the variable is labelled correctly.
c. Round off the average to the nearest whole number.
d. Reorder the data in descending order (highest to lowest)
e. Identify the first 5 students with the lowest average score from test 1 and test 2
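The steps in this exercise can also be sketched outside Excel. The minimal Python example below uses made-up student names and scores, because the contents of ‘example1.xlsx’ are not shown in these slides; the field names (`test1`, `test2`, `avg_test1_test2`) and the helper `round_half_up` are assumptions for illustration only.

```python
# Hypothetical records; the real columns in 'example1.xlsx' may differ.
students = [
    {"name": "A", "test1": 62, "test2": 71},
    {"name": "B", "test1": 48, "test2": 55},
    {"name": "C", "test1": 90, "test2": 84},
    {"name": "D", "test1": 35, "test2": 42},
    {"name": "E", "test1": 77, "test2": 70},
    {"name": "F", "test1": 58, "test2": 61},
]


def round_half_up(value):
    """Round a non-negative number to the nearest whole number, halves up.

    Python's built-in round() sends exact halves to the nearest even number,
    so this helper is used to mirror Excel's ROUND for positive values.
    """
    return int(value + 0.5)


# a-c. Create a clearly labelled variable holding the rounded average.
for s in students:
    s["avg_test1_test2"] = round_half_up((s["test1"] + s["test2"]) / 2)

# d. Reorder the data from highest to lowest average.
ranked = sorted(students, key=lambda s: s["avg_test1_test2"], reverse=True)

# e. The five students with the lowest averages (lowest first).
lowest_five = sorted(students, key=lambda s: s["avg_test1_test2"])[:5]

print([(s["name"], s["avg_test1_test2"]) for s in ranked])
```

In Excel the same result would typically come from a formula column followed by a sort on that column.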
2. Coding and Labelling Variables
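1. One way to create a variable is to generate a new column, as already done in the previous exercise
2. A different way is to define a name:
a. Assign a value to a labelled variable
b. The labelled variable can be used in mathematical functions
c. The labelled variable can be used for coding data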
2. Coding and Labelling Variables
1. Click on the Formulas tab on the ribbon
2. Select Define Name
3. A pop-up box headed ‘New Name’ will appear on the screen
2. Coding and Labelling Variables

Name – variable name
Scope – which sheet or workbook you want the variable to appear in
Comment – details about the variable
Refers to – the value that you would like to assign to the variable
2. Coding and Labelling Variables

Name Manager allows you to edit the variables by:

• renaming variables
• changing the values of variables
• deleting variables
• creating new variables
Exercise: Coding and Labelling Variables
1. Using dataset 1 from the ‘example1.xlsx’ data file:
a. Create a new variable called gender
b. Assign the code 1 to all males
c. Assign the code 2 to all females
d. What is the total number of males in the class?
e. What is the total number of females in the class?

f. Create a new variable called race

g. Assign codes as follows:
 1 – Black
 2 – Coloured
 3 – Indian
 4 – White
h. What is the racial distribution of the class?

i. Create a variable to represent the average score of each test

j. What was the overall average score for this exam period?
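As a rough illustration of coding and counting, the following Python sketch applies the gender and race codes from this exercise to a small made-up class. The raw values and the names `GENDER_CODES`, `RACE_LABELS`, `raw_gender` and `race_codes` are hypothetical; only the code-to-label pairs come from the exercise.

```python
from collections import Counter

# Code books taken from the exercise.
GENDER_CODES = {"male": 1, "female": 2}
RACE_LABELS = {1: "Black", 2: "Coloured", 3: "Indian", 4: "White"}

# Hypothetical class of five students (not the workshop dataset).
raw_gender = ["male", "female", "female", "male", "female"]
race_codes = [1, 4, 2, 1, 3]

# b-c. Replace each text value with its numeric code.
gender_codes = [GENDER_CODES[g] for g in raw_gender]

# d-e. Count how many students carry each gender code.
gender_counts = Counter(gender_codes)

# h. Distribution of the class by race, reported with labels.
race_distribution = {RACE_LABELS[c]: n for c, n in Counter(race_codes).items()}

print(gender_counts[1], "males,", gender_counts[2], "females")
print(race_distribution)
```

Keeping the code book in one place makes it easy to translate codes back to labels when reporting results.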
Section B: Descriptive Statistics
1. Measures of Central Tendency
2. Standard Deviation
3. PivotTables
1. Measures of central tendency
A. Mean (population mean or sample mean)
B. Median
C. Mode

1. Measures of central tendency are useful in describing data and aid in understanding the distribution of the data
2. They are also referred to as measures of location, as they pinpoint the centre of the distribution of the data
A. The Mean
1. The Mean is the most widely used measure of location
2. It is calculated by summing the values and dividing by the number
of values

$$\bar{X} = \frac{\sum X}{n}$$
Example
A household consisting of five household members with the following
ages: 45, 49, 25, 19, 11

What is the average age of household members?

$$\bar{X} = \frac{\sum X}{n} = \frac{45 + 49 + 25 + 19 + 11}{5} = \frac{149}{5} = 29.8$$
A. Characteristics of the Mean
1. Every set of interval-level and ratio-level data has a mean
2. All the values are included in computing the mean
3. A set of data has a unique mean
4. The mean is affected by unusually large or small data values
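The household example and the last characteristic can be checked with a few lines of Python. This is a small sketch using the standard `statistics` module; the extra 95-year-old household member is a made-up value added only to show how one unusually large observation pulls the mean upwards.

```python
from statistics import fmean

ages = [45, 49, 25, 19, 11]

# Sum of the values divided by the number of values.
mean_age = fmean(ages)
print(mean_age)

# Hypothetical outlier: one very old household member is added.
ages_with_outlier = ages + [95]
mean_with_outlier = fmean(ages_with_outlier)
print(round(mean_with_outlier, 2))
```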
B. The Median
1. The Median is the midpoint of the values after they have been
ordered from the smallest to the largest
2. There are as many values above the median as below it in the data
array
3. For an even set of values, the median will be the arithmetic average
of the two middle numbers
Examples
Example 1
1. The ages for a sample of five students are: 21, 25, 19, 20, 22
2. Arranging the data in ascending order gives: 19, 20, 21, 22, 25.
3. Thus the median is 21.

Example 2
1. The heights of four basketball players, in inches, are: 76, 73, 80, 75
2. Arranging the data in ascending order gives: 73, 75, 76, 80.
3. Thus the median is 75.5
B. Characteristics of the Median
1. There is a unique median for each data set
2. It is not affected by extremely large or small values and is therefore
a valuable measure of central tendency when such values occur
3. It can be computed for ratio-level, interval-level, and ordinal-level
data
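The two median examples can be reproduced with Python's standard `statistics.median`, which sorts the values and averages the two middle ones when the count is even. The final line replaces the oldest student's age with a made-up extreme value (60) to illustrate that the median does not move.

```python
from statistics import median

student_ages = [21, 25, 19, 20, 22]      # Example 1: odd number of values
player_heights = [76, 73, 80, 75]        # Example 2: even number of values

median_age = median(student_ages)
median_height = median(player_heights)

# Hypothetical extreme value: the 25-year-old is replaced by a 60-year-old.
median_with_extreme = median([21, 60, 19, 20, 22])

print(median_age, median_height, median_with_extreme)
```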
C. The Mode
1. The mode is the value of the observation that appears the most
frequently

Example: The exam scores for ten students are:

81, 93, 84, 75, 68, 87, 81, 75, 81, 87
Rearrange the scores in ascending order:
68, 75, 75, 81, 81, 81, 84, 87, 87, 93
Because the score of 81 occurs the most often, it is the mode.
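A frequency count makes the mode easy to verify. The sketch below uses the ten exam scores from the example together with `collections.Counter` and `statistics.mode` from the Python standard library.

```python
from collections import Counter
from statistics import mode

scores = [81, 93, 84, 75, 68, 87, 81, 75, 81, 87]

# Count how often each score appears.
frequencies = Counter(scores)
most_common_score, times_seen = frequencies.most_common(1)[0]

print(most_common_score, times_seen)
print(mode(scores))
```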
1. Symmetric distribution
Zero skewness

mode = median = mean

1. Right skewed distribution
Positively skewed: Mean and Median are to the right of the Mode

Mode<Median<Mean
1. Left skewed distribution
Negatively Skewed: Mean and Median are to the left of the Mode

Mean<Median<Mode
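The ordering of the mode, median and mean under each shape can be illustrated with three tiny datasets. The values below are invented for illustration, and the orderings are a rule of thumb that holds for these examples rather than a guarantee for every dataset.

```python
from statistics import fmean, median, mode


def summarise(values):
    """Return (mode, median, mean) for a list of numbers."""
    return mode(values), median(values), fmean(values)


symmetric = [1, 2, 3, 3, 3, 4, 5]      # hypothetical data
right_skewed = [2, 2, 3, 4, 9]         # long tail to the right
left_skewed = [1, 6, 7, 8, 8]          # long tail to the left

print(summarise(symmetric))     # mode = median = mean
print(summarise(right_skewed))  # mode < median < mean
print(summarise(left_skewed))   # mean < median < mode
```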
2. Standard Deviation
• Measure of dispersion
• Details the spread of the observed values around the mean
• Used in conjunction with the mean to summarise continuous data
• The standard deviation is influenced by the presence of outliers in a dataset
• Low standard deviation = the data is closely dispersed around the mean
• High standard deviation = observations are widely dispersed around the mean, estimates are usually unreliable

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$

Where,
s = sample standard deviation
Σ = sum of...
X̄ = sample mean
n = number of scores in sample
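To make the sample standard deviation formula concrete, the sketch below applies it step by step to the household ages used earlier for the mean (45, 49, 25, 19, 11) and compares the result with `statistics.stdev`, which also divides by n − 1. Reusing the ages here is a choice made for illustration; the slides do not work this example.

```python
import math
from statistics import fmean, stdev

ages = [45, 49, 25, 19, 11]

mean_age = fmean(ages)                                  # sample mean
squared_deviations = [(x - mean_age) ** 2 for x in ages]
sample_variance = sum(squared_deviations) / (len(ages) - 1)
s = math.sqrt(sample_variance)                          # sample standard deviation

print(round(s, 2))
print(round(stdev(ages), 2))  # library result for comparison
```

In Excel, STDEV.S applies the same n − 1 formula.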
3. PivotTables
1. Summarise, sort, reorganise, group, count, total or average data
2. Cross-tabulations
3. Make comparisons and identify patterns and trends
4. Flexible table development
a. Transform rows into columns and columns into rows
b. Group data by different fields
3. Steps to build a PivotTable
1. Select the cells you want to include in the PivotTable
2. Select the Insert tab on the ribbon and select PivotTable
3. A pop-up box will appear; under ‘Choose the data that you want to
analyze’, select ‘Select a table or range’
4. In Table/Range, verify the cell range.
5. Under ‘Choose where you want the PivotTable report to be placed’,
select New worksheet to place the PivotTable in a new worksheet,
or Existing worksheet and then select the location where you want the
PivotTable to appear.
6. Select OK
3. Steps to build a PivotTable
7. To add a field to your PivotTable, select the field name checkbox in
the PivotTable Fields pane.
8. To move a field from one area to another, drag the field to the
target area.
9. Selected fields are added to their default areas:
a. Non-numeric fields are added to rows
b. Date and time hierarchies are added to columns
c. Numeric fields are added to values
Examples
Using data from the ‘Example 2.xlsx’ data file:

1. Make a PivotTable showing the average weight of dogs by food type

2. Create a PivotTable showing the average height of each dog by dog
type and brand of food
3. Develop a PivotTable detailing the average weight of dogs by food
brand
4. Create a PivotTable showing the average age and weight of dogs by
food brand
5. Provide the maximum and minimum age of dogs by food type
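Conceptually, a PivotTable groups rows by one field and then summarises another. The Python sketch below mimics the first exercise (average weight of dogs by food type) on invented records; the field names `food_type` and `weight_kg` and all values are assumptions, since the contents of the workshop file are not shown in the slides.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical rows; the real file may use different columns and units.
dogs = [
    {"food_type": "dry", "weight_kg": 12.0},
    {"food_type": "wet", "weight_kg": 20.0},
    {"food_type": "dry", "weight_kg": 18.0},
    {"food_type": "wet", "weight_kg": 24.0},
    {"food_type": "raw", "weight_kg": 30.0},
]

# "Rows" area: group the weights by food type.
weights_by_food = defaultdict(list)
for dog in dogs:
    weights_by_food[dog["food_type"]].append(dog["weight_kg"])

# "Values" area: summarise each group with an average.
average_weight_by_food = {
    food: fmean(weights) for food, weights in weights_by_food.items()
}

print(average_weight_by_food)
```

Swapping `fmean` for `max` or `min` gives the kind of summary asked for in the last exercise.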
Section C: Introduction to Hypothesis Testing
1. What is Hypothesis Testing?
2. Definitions
3. Tests of Significance
4. Statistical Tests used in Hypothesis Testing
5. Understanding p-values
6. One-Sample T-Test
7. Two-Sample T-Test
8. Chi-Square Test of Independence
1. What is a Hypothesis and Hypothesis Testing?
• A Hypothesis is a statement about an expectation or prediction that
will be tested by research
• Tentative statement about the relationship between two or more
variables
• It is developed for the specific purpose of testing and is informed by
the body of existing literature
• Hypothesis testing is a procedure used to determine whether the
hypothesis is a reasonable statement and should not be rejected, or is
unreasonable and should be rejected.
Example of a Hypothesis
A study designed to look at the relationship between sleep deprivation
and exam performance might have a hypothesis that states:

This study is designed to assess the hypothesis that sleep-deprived
people will perform worse on a test than individuals who are not
sleep-deprived
2. Definitions
1. A Null Hypothesis H0 : A statement that will be tested
2. An Alternative Hypothesis H1 : A statement that is accepted if the
sample data provides evidence that the null hypothesis is false
3. A Level of Significance: The probability of rejecting the null
hypothesis when it is actually true
a. Denoted as α (alpha)
b. 0.05 level of significance is the accepted standard in social science research
2. Definitions
4. A Type I Error: Rejecting the null hypothesis when it is actually true.
5. A Type II Error: Failing to reject the null hypothesis when it is
actually false.
6. A Test statistic: A value, determined from sample information, used
to determine whether or not to reject the null hypothesis.
7. A Critical value: (Rejection region) The dividing point between the
region where the null hypothesis is rejected and the region where it
is not rejected.
3. One-Tailed Test of Significance
A test is one-tailed when the alternate hypothesis, H1, states a direction

Example:
• H0 : There is no difference in the average height of males and females
• H1 : Males have a higher average height compared to females OR on average,
males are taller than females
3. Two-Tailed Test of Significance
A test is two-tailed when no direction is specified in the alternate
hypothesis H1 , such as:

Example:
• H0 : The mean amount spent by customers at Pick ‘n Pay in Grahamstown on
any day of the week is equal to R450.00
• H1 : The mean amount spent by customers at Pick ‘n Pay in Grahamstown on
any day of the week is not equal to R450.00 (µ ≠ R450).
4. Statistical Tests that are used in Hypothesis Testing

1. One-Sample T-Test (Student’s T-Test)
2. Two-Sample T-Test
3. Chi-Square Test of Independence
4. Pearson's Correlation Analysis
5. Regression Analysis
5. Understanding p-values
1. Used to support or reject the null hypothesis
2. Smaller p-value, stronger evidence to reject the null
3. Smaller p-value, more significant results
4. P-value compared to the alpha
5. Alpha levels are related to confidence levels

Assumed standard in Social Science research:
Confidence level = 95%
100% – 95% = 5%
Therefore, α = 0.05

Compare the generated p-value to the alpha value:
Small p-value (≤0.05), reject the null hypothesis. Strong evidence
that the null hypothesis is invalid.
Large p-value (>0.05), fail to reject the null hypothesis. Evidence for the
alternate hypothesis is weak.
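The link between the confidence level, alpha and the decision can be written as two small functions. This is only a sketch of the rule stated on this slide; the function names `alpha_from_confidence` and `decide` and the example p-values are hypothetical.

```python
def alpha_from_confidence(confidence_percent):
    """Convert a confidence level in percent (e.g. 95) to alpha (e.g. 0.05)."""
    return (100 - confidence_percent) / 100


def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when the p-value is at most alpha."""
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


alpha = alpha_from_confidence(95)
print(alpha)
print(decide(0.03, alpha))  # hypothetical small p-value
print(decide(0.20, alpha))  # hypothetical large p-value
```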
6. One-Sample T-Test
1. Also called a Student’s T-Test
2. Allows for the comparison of a sample mean with a hypothesised
value, such as a known or historical population mean
3. Aim of the test is to test whether the means are statistically
different
4. Example …
Examples
1. Historical data shows that the average score on the Politics exam is 67%. Does
this differ from the current year Politics average exam score?
a. Null hypothesis (H0): There is no difference between the historical average score on the Politics exams
and the current average score OR the current average score on the Politics exam is 67%
b. Alternative hypothesis (H1): there is a statistically significant difference in the average exam scores
c. One-tailed or two-tailed test? Two-tailed test

2. Assume that hospital records show that on average babies weigh 3 kg at birth. We
have collected birth weight data during the lockdown period of the Covid
pandemic and want to test if babies born during this lockdown period are born
at a lower weight compared to the historical birth weight records
a. Null hypothesis (H0): There is no difference between the historical average birth weight of babies and the
average birth weight of babies born during Covid
b. Alternative hypothesis (H1): on average, babies born during the Covid pandemic have a lower birth
weight compared to those born before the pandemic
c. One-tailed or two-tailed test? One-tailed test
6. Assumptions of a One-Sample T-Test
1. Data is continuous and of a quantitative scale – interval and ratio
2. The sample should be randomly selected from the population
3. The samples are independent of each other
4. Data should be normally distributed
5. Parametric test
6. Assumes that the data does not include any outliers
7. Standard deviation of the population unknown
8. Sample size below 30
6. Practical example
Using dataset 1 from the ‘example1.xlsx’ data file:

1. Calculate whether the average test score for Test 3 is equal to 25

a. State the null and alternate hypothesis
i. Null – there is no difference between the average score for test 3 and 25
ii. Alternate – there is a difference between the average score for test 3 and 25 OR the average
score for test 3 is not equal to 25
iii. Is it a one-tailed or two-tailed test? Two-tailed test
b. Conclusion – there is a statistically significant difference between the average score for test 3 and 25

2. Calculate whether the average test score for Test 4 is greater than 30
a. State the null and alternate hypothesis
i. Null – The average test 4 score is not greater than 30 Or there is no difference between test 4
average and 30
ii. Alternate – The average test 4 score is greater than 30
iii. Is it a one-tailed or two-tailed test? One-tailed test
b. Conclusion – the average score for test 4 is statistically higher than 30
6. Interpretation of Results
1. The statistical test will produce various statistics
2. When interpreting the data, look out for the p-value, t-statistic and
critical value – all of these are generated in the statistical test
3. If the:
a. T-statistic is larger than the critical value (T-stat > Critical Value)
b. P-value is smaller than 0.05 (P-value < 0.05)
c. Then you reject the null hypothesis
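To show where the t-statistic comes from, the sketch below computes it by hand for a two-tailed one-sample test against a hypothesised mean of 25. The ten scores are invented (they are not the Test 3 data from the workshop file), and the critical value 2.262 is the standard t-table entry for 9 degrees of freedom at α = 0.05, two-tailed. The absolute value of the t-statistic is compared with the critical value because the test is two-tailed.

```python
import math
from statistics import fmean, stdev


def one_sample_t(sample, hypothesised_mean):
    """t = (sample mean - hypothesised mean) / (s / sqrt(n))."""
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    return (fmean(sample) - hypothesised_mean) / standard_error


# Hypothetical scores for ten students.
scores = [27, 30, 24, 29, 31, 26, 28, 32, 25, 28]

t_stat = one_sample_t(scores, 25)
degrees_of_freedom = len(scores) - 1

# t-table value for df = 9, alpha = 0.05, two-tailed.
T_CRITICAL = 2.262
reject_null = abs(t_stat) > T_CRITICAL

print(round(t_stat, 3), degrees_of_freedom, reject_null)
```

Statistical software reports the p-value alongside the t-statistic, so the table lookup is only needed when working by hand.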
7. Two-Sample T-Test
1. Also called the Independent Samples T-Test
2. Used to test whether the unknown population means of two independent
groups are different or equal
3. Two variables:
a. Variable one: defines the two groups
b. Variable two: measurement of interest
Examples
1. Assume that we have two groups of students. One group has English as a first language and the
second group has English as their second language. Both groups of students take a reading
test. The assumption is that there is a difference in the average test scores across both groups
of students.
a. Null hypothesis (H0): There is no difference between the test scores of the English first language and second language
groups of students
b. Alternative hypothesis (H1): There is a statistically significant difference in the test scores between the English first language
and second language groups of students
c. One-tailed or two-tailed test: Two-tailed test

2. Assume that we have a random sample of males and females from a school of interest, and we
want to test whether there is a difference in the average hours per week spent in the school
library. The assumption is that female students spend more time in the library per week
compared to male students.
a. Null hypothesis (H0): There is no difference in the average hours spent in the library per week by male and female students
b. Alternative hypothesis (H1): On average, female students spend more time in the library per week compared to male
students
c. One-tailed or two-tailed test: One-tailed test
7. Assumptions of a Two-Sample T-Test
1. Data values must be independent
2. Sample selected through simple random sampling
3. Data in each group are normally distributed
4. Parametric test
5. Data values are continuous, ratio or interval scale
6. The variances for the two groups are equal
7. Practical Examples
Using dataset 1 from the ‘example1.xlsx’ data file:

1. Test whether there is a difference in the mean score between Test 1 and Test 2
a. Null – There is no difference between the mean score of test 1 and test 2
b. Alternate – There is a statistically significant difference between the mean score of test 1 and test 2
c. Is it a one-tailed or two-tailed test: Two-tailed test
d. Conclusion – There is a statistically significant difference between the average scores of test 1 and test 2

2. Test whether there is a difference between the mean score of Test 2 and Test 4
a. Null – There is no difference between the mean score of test 2 and test 4
b. Alternate – There is a statistically significant difference between the mean score of test 2 and test 4
c. Is it a one-tailed or two-tailed test: Two-tailed test
d. Conclusion –
7. Interpretation of the Results
1. The statistical test will produce various statistics
2. When interpreting the data, look out for the p-value, t-statistic and
critical value – all of these are generated in the statistical test
3. If the:
a. T-statistic is larger than the critical value (T-stat > Critical Value)
b. P-value is smaller than 0.05 (P-value < 0.05)
c. Then you reject the null hypothesis
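A hand calculation of the two-sample t-statistic helps connect it to the equal-variance assumption listed earlier. The sketch below uses the pooled-variance form of the test on two invented groups of five scores; the data, the function name `pooled_two_sample_t` and the group labels are hypothetical. The critical value 2.306 is the standard t-table entry for 8 degrees of freedom at α = 0.05, two-tailed.

```python
import math
from statistics import fmean, variance


def pooled_two_sample_t(group_a, group_b):
    """Independent-samples t-statistic assuming equal group variances."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (fmean(group_a) - fmean(group_b)) / standard_error


# Hypothetical reading-test scores for two independent groups.
first_language = [12, 15, 11, 14, 13]
second_language = [9, 10, 8, 11, 12]

t_stat = pooled_two_sample_t(first_language, second_language)
degrees_of_freedom = len(first_language) + len(second_language) - 2

# t-table value for df = 8, alpha = 0.05, two-tailed.
T_CRITICAL = 2.306
reject_null = abs(t_stat) > T_CRITICAL

print(round(t_stat, 3), degrees_of_freedom, reject_null)
```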
8. Chi-Square Test of Independence
1. Tests whether there is an association between two categorical
variables
2. Each variable should have at least 2 categories
3. Non-parametric test, nominal or ordinal scale
4. Does not assume that the data is normally distributed
5. Makes use of contingency tables to analyse the data
6. It does not show the strength or direction of the association
Examples
Test whether there is an association between gender and monthly
income
a. Null hypothesis (H0): there is no association between gender and monthly income in
the NIDS 2008 dataset
b. Alternative hypothesis (H1): there is an association between gender and monthly
income in the NIDS 2008 dataset
Average monthly income categories
Gender     0 – 5000     5001 – 10 000     10 001 – 15 000     15 001 – 20 000     Total
Male
Female
Total
Examples
Test whether there is an association between geographic location and
employment type
a. Null hypothesis (H0): there is no association between geographic location and employment type
b. Alternative hypothesis (H1): there is an association between geographic location and
employment type
Employment type
Geographic location     Formal     Informal     Total
Urban
Rural
Farms
Total
8. Assumptions of a Chi-Square Test of Independence
1. Variables must be on the nominal or ordinal scale
2. Categories of variables must be mutually exclusive
3. Sample selected through simple random sampling
4. Data in the contingency table needs to be frequencies or counts
5. Using the observed frequencies in the contingency table, the
expected frequencies are calculated

Expected Frequencies = (row total*column total)/grand total

8. Practical examples
Using dataset 1 from the ‘NIDS 2008_practice data.xlsx’
1. Test whether there is an association between gender and highest level of
education in the NIDS 2008 subsample data
a. Null –
b. Alternative –
c. Conclusion –

Using dataset 2 from the ‘NIDS 2008_practice data.xlsx’

2. Test whether there is an association between highest level of education
and employment status among the NIDS 2008 subsample
a. Null –
b. Alternate –
c. Conclusion –
8. Interpretation of the Results
Steps to follow when conducting a Chi-Square Test of Independence:
1. Calculate the expected frequency using the formula
2. Calculate the p-value
3. Interpret the p-value
4. Draw a conclusion

Interpreting the p-value:
If p-value < 0.05, reject the null hypothesis
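The four steps can be traced on a small contingency table. The sketch below uses invented counts laid out like the geographic location by employment type table (Urban, Rural, Farms by Formal, Informal); none of the numbers come from the NIDS data. Because this table has exactly 2 degrees of freedom, the p-value can be computed in closed form as exp(−χ²/2); for other table sizes a statistical package or a chi-square table would be needed.

```python
import math

# Hypothetical observed counts: rows = Urban, Rural, Farms; columns = Formal, Informal.
observed = [
    [60, 40],
    [30, 30],
    [10, 30],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Step 1: expected frequency = (row total * column total) / grand total.
expected = [
    [row_total * column_total / grand_total for column_total in column_totals]
    for row_total in row_totals
]

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for observed_row, expected_row in zip(observed, expected)
    for o, e in zip(observed_row, expected_row)
)

degrees_of_freedom = (len(row_totals) - 1) * (len(column_totals) - 1)

# Step 2: p-value. The closed form below is valid only for 2 degrees of freedom.
assert degrees_of_freedom == 2
p_value = math.exp(-chi_square / 2)

# Steps 3-4: interpret the p-value and draw a conclusion.
reject_null = p_value < 0.05

print(expected)
print(round(chi_square, 3), round(p_value, 6), reject_null)
```

In Excel, CHISQ.TEST takes the observed and expected ranges and returns the p-value directly.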
End of day 2, thank you!
