UNIVERSITY OF CAPE TOWN

DEPARTMENT OF STATISTICAL SCIENCES

STA3022F
JUNE 2013 EXAMINATION

INTERNAL EXAMINERS: A/Prof S Lubbe, Dr Ş Er
INTERNAL ASSESSOR: A/Prof F Little
EXTERNAL EXAMINER: Dr T Berning
TOTAL MARKS: 100
TIME ALLOWED: 3 hours
PAGES: 14
INSTRUCTIONS: ANSWER EACH SECTION IN A SEPARATE BOOK.
ALL QUESTIONS MAY BE ATTEMPTED.
MARKS ARE ALLOCATED FOR INTERMEDIATE CALCULATIONS.

SECTION A: EXPLORATORY METHODS [Available marks: 51]

Question 1 [3 marks]
Consider a data set of the monthly inflation rate over several years for all African countries.
(a) Would you consider inflation rate to be a numerical, ordinal categorical or nominal variable? (1)
(b) If the inflation figure for Zimbabwe for June 2012 is missing, name one method of overcoming this
problem. (1)
(c) You want to perform an analysis that requires normally distributed data. You plotted histograms of the
data, some of which are shown below. Since the data is very skew, what can be done before embarking
on the analysis? (1)
[Figure: six histograms in two rows of three, each showing Frequency against x; x starts at 0.0 in every panel and extends to between 0.4 and 0.8.]

Question 2 [15 marks]
Consider the following contingency table and associated R output. In the table below, 300 people
were asked what bank they used most often and to identify the most important reason why they
chose that bank.

          Helpful  Enjoy        Branch close  Keeps me  Many  Competitive
          staff    advertising  to home       informed  ATMs  interest rates  TOTAL
FNB       14       14           2             23        7     12              72
ABSA      13       6            10            15        5     21              70
NEDBANK   11       16           8             4         10    8               57
STDBANK   12       14           30            8         28    9               101
TOTAL     50       50           50            50        50    50              300

> ca(bank.data)
Principal inertias (eigenvalues):
1 2 3
Value 0.174676 0.046073 0.021778
Percentage 72.02% 19% 8.98%

Rows:
FNB ABSA NEDBANK STDBANK
Mass 0.240000 0.233333 0.190000 0.336667
ChiDist 0.542201 0.468606 0.383164 0.525126
Inertia 0.070556 0.051238 0.027895 0.092838
Dim. 1 -1.172400 -0.794016 0.278962 1.228645
Dim. 2 -0.781987 1.457919 -1.315956 0.289685

Columns:
         Helpful    Enjoy        Branch close  Keeps me   Many       Competitive
         staff      advertising  to home       informed   ATMs       interest rates
Mass     0.166667   0.166667     0.166667      0.166667   0.166667   0.166667
ChiDist  0.205443   0.400249     0.618174      0.614089   0.516271   0.476418
Inertia  0.007034   0.026700     0.063690      0.062851   0.044423   0.037829
Dim. 1   -0.427021  0.023291     1.378472      -1.336570  1.197046   -0.835218
Dim. 2   -0.278986  -1.788994    1.041542      0.087283   -0.301211  1.240367

(a) In order to test the hypothesis 𝐻0 : no significant association between bank and reason for using that
bank, give the test statistic and its associated distribution. Also define each of the symbols you use in the
definition of the test statistic and indicate how to calculate or obtain them. (4)

(b) Give an expression for the Pearson residuals calculated for a contingency table. (1)

(c) Once the matrix of Pearson residuals is obtained, each entry is divided by the grand total from the
contingency table. Name and give a mathematical expression for the method of obtaining the CA map
coordinates from this matrix. (2)

(d) Construct a CA map for the contingency table above. (5)

(e) What proportion of variance in the data are you displaying in your map in (d)? (1)

(f) Which bank provides the most competitive interest rates? (1)

(g) Why do customers tend to choose Nedbank? (1)
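
As a rough illustration of the quantity behind part (a), the Pearson chi-square statistic can be computed directly from the contingency table. The sketch below is written in Python rather than the R used for the exam output, and the function name `pearson_chi_square` is a hypothetical helper introduced only for this example. It also checks the standard relationship that the statistic equals the grand total times the total inertia reported by `ca()`.

```python
# Observed counts from the contingency table
# (rows: FNB, ABSA, NEDBANK, STDBANK; columns: the six reasons).
observed = [
    [14, 14, 2, 23, 7, 12],
    [13, 6, 10, 15, 5, 21],
    [11, 16, 8, 4, 10, 8],
    [12, 14, 30, 8, 28, 9],
]


def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom, grand total)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            # Expected count under independence: row total * column total / n
            e = row_totals[i] * col_totals[j] / n
            chi2 += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, n


chi2, df, n = pearson_chi_square(observed)

# Sum of the principal inertias printed in the ca() output.
total_inertia = 0.174676 + 0.046073 + 0.021778

print(round(chi2, 3), df, round(n * total_inertia, 3))
```

Under the null hypothesis the statistic is compared with a chi-square distribution on (rows - 1)(columns - 1) degrees of freedom, which is 15 for this table.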

Question 3 [18 marks]

In an analysis of Olympic decathlon scores, 160 complete starts were made by 139 athletes. The
scores for each of the 10 decathlon events were standardized and the signs of the timed events
changed so that large scores are good for all events. The factor analysis output is shown below.

(a) Give the general model for factor analysis for factors 𝐹1 , … , 𝐹𝑞 based on manifest variables
𝑋1 , … , 𝑋𝑝 . (2)

(b) Give the general principal component analysis model for principal components 𝑌1 , … , 𝑌𝑚 based
on manifest variables 𝑋1 , … , 𝑋𝑝 . (2)

(d) Why did the researcher decide to use four factors? (1)

(e) What is the advantage of the varimax rotation? (1)

(f) Interpret and name the four factors. (8)

(g) Give an expression for the communality of the 𝑖-th manifest variable and explain what its
meaning is. (2)
[Figure: scree plot of eigenvalue against Index (1 to 10).]

> eigenvalues
[1] 4.21290 2.88958 2.24310 1.05940 0.91770 0.66980 0.57570 0.42110 0.32190 0.25250

> loadings
[,1] [,2] [,3] [,4]
[100m run] -0.6961444 0.02209774 -0.468329773 0.41636799
[long jump] -0.7925464 0.07517162 -0.254696029 0.11462221
[shot put] -0.7710515 -0.43442586 0.197341218 0.11216995
[high jump] -0.7107713 0.18069329 0.004571862 -0.36745024
[400m run] -0.6048010 0.54866998 -0.045077267 0.39698116
[110m hurdles] -0.5126311 -0.08271661 -0.371744685 -0.56108748
[discus] -0.6897880 -0.45643672 0.288597881 0.07771845
[pole vault] -0.7609729 0.16239082 0.018258009 -0.30422480
[javelin] -0.5184056 -0.25162810 0.518908343 0.07356806
[1500m run] -0.2197133 0.74576659 0.493070762 -0.08518221

> varimax(loadings)
[100m run] -0.885 -0.139 0.182 -0.205
[long jump] -0.664 0.201 -0.693
[shot put] -0.023 0.819 -0.152
[high jump] -0.121 0.293 0.237 -0.683
[400m run] -0.746 0.750
[110m hurdles] -0.108 -0.161 -0.826
[discus] -0.185 0.832 -0.204
[pole vault] -0.207 0.193 0.124 -0.656
[javelin] 0.188 0.754
[1500m run] 0.921

[,1] [,2] [,3] [,4]

SS loadings 2.045 1.376 2.234 1.924
Proportion Var 0.205 0.138 0.223 0.192
Cumulative Var 0.205 0.342 0.566 0.758

Question 4 [5 marks]

In order to evaluate the quality of health care at a local clinic, researchers designed a questionnaire
consisting of several statements. The statements are rated on a five point scale and each statement is
designed such that a response of “Totally disagree”, scored 1, indicates poor quality and “Totally
agree”, scored 5, indicates excellent quality. Several aspects of health care quality need to be
evaluated. Questions 2, 5, 8, 15 and 23 are designed to deal with waiting time at the clinic. In order to
assess whether the questions were adequately designed, the researchers did a pilot study asking 20
random clinic visitors to complete the questionnaire. The data below was captured from the pilot
study.

(a) Use Cronbach’s alpha to assess the internal consistency of Questions 2, 5, 8, 15, 23. (3)

(b) Based on your calculation in (a), can you confirm that these questions were well designed to
assess waiting time at the clinic? Motivate your answer. (2)

𝑄2 𝑄5 𝑄8 𝑄15 𝑄23 𝑄2 + 𝑄5 + 𝑄8 + 𝑄15 + 𝑄23
1 1 1 1 1 5
5 5 5 1 5 21
1 4 1 3 1 10
4 5 2 3 2 16
2 3 1 1 2 9
5 5 5 3 5 23
3 4 1 1 1 10
2 3 2 3 2 12
5 5 5 3 5 23
1 1 1 1 1 5
5 5 5 3 5 23
2 3 2 2 2 11
5 5 5 5 5 25
5 5 3 5 3 21
1 1 1 1 1 5
5 5 3 3 3 19
5 5 5 5 5 25
2 3 1 3 1 10
2 3 1 1 1 8
1 1 1 1 2 6
Variance 3.042 2.463 3.103 2.050 2.871 54.661
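
The Variance row above is all that is needed for the usual form of Cronbach's alpha, alpha = k/(k - 1) * (1 - sum of item variances / variance of the total score). The following is a minimal sketch in Python (not the R used elsewhere in the paper); `cronbach_alpha` is a hypothetical helper name, and the inputs are copied from the table rather than recomputed from the raw responses.

```python
# Item variances for Q2, Q5, Q8, Q15, Q23 and the variance of the summed
# score, all taken from the "Variance" row of the pilot-study table.
item_variances = [3.042, 2.463, 3.103, 2.050, 2.871]
total_variance = 54.661


def cronbach_alpha(item_vars, total_var):
    """Cronbach's alpha from item variances and the total-score variance."""
    k = len(item_vars)
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


alpha = cronbach_alpha(item_variances, total_variance)
print(round(alpha, 3))
```

A commonly quoted rule of thumb treats values above roughly 0.7 as acceptable internal consistency; the exact threshold expected in the course is not stated in the paper.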

Question 5 [8 marks]

The marketing department of a margarine company would like to segment all eight products in the
market, own and competition, in order to find segments with ‘gaps’ where a new product can be
launched. They asked a panel of 15 evaluators to assess the pairwise differences between the
products. Based on all 15 evaluations, the following dissimilarity matrix was computed.

𝐷 =
      𝑃1   𝑃2   𝑃3   𝑃4   𝑃5   𝑃6   𝑃7   𝑃8
𝑃1   0    2.5  1.0  2.8  2.9  3.5  2.7  0.8
𝑃2   2.5  0    2.0  1.9  1.9  2.4  0.6  3.7
𝑃3   1.0  2.0  0    1.8  2.1  2.8  3.2  1.5
𝑃4   2.8  1.9  1.8  0    0.3  1.2  2.1  2.3
𝑃5   2.9  1.9  2.1  0.3  0    0.4  2.0  2.5
𝑃6   3.5  2.4  2.8  1.2  0.4  0    3.8  4.4
𝑃7   2.7  0.6  3.2  2.1  2.0  3.8  0    4.2
𝑃8   0.8  3.7  1.5  2.3  2.5  4.4  4.2  0

A hierarchical cluster analysis was performed on the data to segment the products. In the process
the following three clusters were identified. In the next step, two of the three clusters need to be
merged.
Cluster A : Products 𝑃1, 𝑃3, 𝑃8
Cluster B : Products 𝑃2, 𝑃7
Cluster C : Products 𝑃4, 𝑃5, 𝑃6
(a) Determine the distance between the clusters and suggest which two clusters need to be merged
next based on the single linkage method. (4)

(b) Determine the distance between the clusters and suggest which two clusters need to be merged
next based on the complete linkage method. (4)
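
The two linkage rules in (a) and (b) differ only in how the pairwise dissimilarities between two clusters are summarised: single linkage takes the minimum, complete linkage the maximum. A small Python sketch of that comparison, using the matrix 𝐷 and Clusters A, B and C from the question, is given here; the helper `linkage_distance` and the variable names are illustrative assumptions, not part of the exam material.

```python
from itertools import combinations

labels = ["P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"]

# Dissimilarity matrix D from the question (symmetric, zero diagonal).
D = [
    [0.0, 2.5, 1.0, 2.8, 2.9, 3.5, 2.7, 0.8],
    [2.5, 0.0, 2.0, 1.9, 1.9, 2.4, 0.6, 3.7],
    [1.0, 2.0, 0.0, 1.8, 2.1, 2.8, 3.2, 1.5],
    [2.8, 1.9, 1.8, 0.0, 0.3, 1.2, 2.1, 2.3],
    [2.9, 1.9, 2.1, 0.3, 0.0, 0.4, 2.0, 2.5],
    [3.5, 2.4, 2.8, 1.2, 0.4, 0.0, 3.8, 4.4],
    [2.7, 0.6, 3.2, 2.1, 2.0, 3.8, 0.0, 4.2],
    [0.8, 3.7, 1.5, 2.3, 2.5, 4.4, 4.2, 0.0],
]

clusters = {
    "A": ["P1", "P3", "P8"],
    "B": ["P2", "P7"],
    "C": ["P4", "P5", "P6"],
}


def linkage_distance(c1, c2, agg):
    """Aggregate (min or max) all between-cluster dissimilarities."""
    return agg(D[labels.index(a)][labels.index(b)] for a in c1 for b in c2)


pairs = list(combinations(clusters, 2))  # (A, B), (A, C), (B, C)
single = {p: linkage_distance(clusters[p[0]], clusters[p[1]], min) for p in pairs}
complete = {p: linkage_distance(clusters[p[0]], clusters[p[1]], max) for p in pairs}

# Under either rule, the pair with the smallest cluster distance is merged next.
single_merge = min(single, key=single.get)
complete_merge = min(complete, key=complete.get)

print(single, single_merge)
print(complete, complete_merge)
```

The two rules need not agree on which pair to merge, which is the contrast the question is probing.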

SECTION B: PREDICTIVE METHODS [Available marks: 50]

ANSWER EACH SECTION IN A SEPARATE BOOK

Question 6 [15 marks]

The following data was randomly collected from an estate agency website on 98 houses that are for
sale. The data includes the level of the house price (priceLevel), the view that the house has (sea
view, mountain view or no view), the number of rooms in total (totalroom) and the size of the house
(size). The researcher wants to know which of the house attributes can be used in order to
discriminate a house according to the level of its price. The first few rows of the data and the
observed price levels are given in Tables 1 and 2 below. Use the attached output to answer the
questions below.

Price levels and size, total room number and view attributes of the first 20 houses in the data
set.
> ExamQ1All[1:20,18:22]
priceLevel size totalroom viewSea viewMount noView
1 med 81 2 1 0 0
2 med 86 3 1 0 0
3 med 70 2 0 0 1
4 med 62 2 0 0 1
5 low 53 2 0 0 1
6 med 79 2 0 0 1
7 med 81 2 0 0 1
8 med 86 2 1 0 0
9 low 61 2 0 1 0
10 med 52 2 0 1 0
11 med 74 2 0 1 0
12 med 92 2 1 0 0
13 med 84 2 0 1 0
14 med 70 2 0 1 0
15 med 61 2 0 1 0
16 med 79 2 0 0 1
17 high 144 2 0 0 1
18 high 144 2 0 0 1
19 med 95 2 0 0 1
20 med 95 2 0 0 1

(a) Evaluate the hit-rate. (3)

(b) Indicate to which price level group the discriminant model has assigned house 6 and house 9.
Were these correct assignments? (2)

(c) Write down the discriminant functions. (2)

(d) How useful is the second discriminant function as a predictor of group membership in this
problem situation? Explain. (2)

(e) Calculate the group centroids for the medium priced houses. (2)

(f) Is the discriminant model able to statistically discriminate between houses belonging to each of
the three price levels? State appropriate null and alternative hypotheses and justify any
conclusions with supporting statistical evidence. Between which of the groups can the model
discriminate significantly? (4)

Discriminant Analysis Results

> fit1 <- lda(priceLevel~size+totalroom+viewSea+viewMount,
+ data=houseprices, method="moment")

> fit1
Call:
lda(priceLevel ~ size + totalroom + viewSea + viewMount, data = houseprices, method =
"moment")

Prior probabilities of groups:

high low med
0.1428571 0.2857143 0.5714286

Group means:
size totalroom viewSea viewMount
high 138.85714 4.500000 0.4285714 0.2142857
low 60.82143 2.803571 0.0000000 0.2857143
med 78.07143 2.642857 0.2321429 0.3392857

Coefficients of linear discriminants:

LD1 LD2
Constant -4.10345000 -0.82147200
size 0.04945093 -0.01282709
totalroom -0.02052425 0.86211733
viewSea 0.90284141 -1.64665812
viewMount -0.18736819 -1.16439112

Proportion of trace:
LD1 LD2
0.9249 0.0751

Sample Sizes of Each Category

> table(priceLevel)
priceLevel
high low med
14 28 56

Classification Table

> classificationTable=table(ExamQ1All$priceLevel,fitPredict$class)
> classificationTable

high low med

high 12 0 2
low 1 17 10
med 1 6 49
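
The hit-rate referred to in Question 6(a) is the share of houses on the diagonal of this table. The sketch below is in Python rather than R and only restates the counts printed above; it relies on the fact that `table()` places its first argument (the observed `priceLevel`) in the rows and the second (the predicted class) in the columns. The variable names are illustrative.

```python
classes = ["high", "low", "med"]

# Classification table: rows are observed levels, columns are predicted levels.
confusion = [
    [12, 0, 2],   # observed high
    [1, 17, 10],  # observed low
    [1, 6, 49],   # observed med
]

correct = sum(confusion[i][i] for i in range(len(classes)))
total = sum(sum(row) for row in confusion)
hit_rate = correct / total

# Share of each observed group that was assigned to its own group.
per_class = {c: confusion[i][i] / sum(confusion[i]) for i, c in enumerate(classes)}

print(correct, total, round(hit_rate, 4))
print({c: round(v, 3) for c, v in per_class.items()})
```

The per-group rates are included because an overall hit-rate can hide a weak group; here the observed low-priced houses are recovered least often.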

Mahalanobis Distances Between the Groups

> d2
Low and Med Low and High Med and High
[1,] 1.762034 17.87178 10.45382

Observed price levels for all of the 98 houses in the data set.

> priceLevel
[1] med med med med low med med med low med med med med med med
[16] med high high med med med med med low med med med med med med
[31] med med med med med med med med med low low med med low med
[46] med med med low high med med med med med med med med med med
[61] high high med high low high high med high high med high high high high
[76] med med med low low low low low low low low low low low low
[91] low low low low low low low low
Levels: high low med

Predicted price levels for all of the 98 houses in the data set.

> fitPredict$class
[1] med med med med low med med med med med med med med med med
[16] med high high med med med med med med med med med med med med
[31] low low med med med med med med med low low med low low med
[46] low med med low high med med high med low med med med med med
[61] high high med high low high high med high high med high high med med
[76] low med med med med low med low low med low low low med low
[91] low low high low low med med med

Question 7 [12 marks]

A financial investor wants to propose a model to support investment decisions on stock returns by
assessing the following attributes:

Criteria              Variable code  Levels
Sales profitability   profit         0=negative profitability ratio, 1=positive profitability ratio
Market-to-Book value  mbv            0=below average, 1=above average
Beta as risk          beta           0=less risk, 1=high risk
Profit per share      ppershare      0=negative profit per share, 1=positive profit per share
Debt ratio            debt           0=low debt ratio, 1=high debt ratio
National index        Indice         1=share is in first 30, 2=share is in first 50, 3=share is in first 100,
                                     4=share is not classified in the first 100

An analyst used Classification Trees to identify an appropriate decision rule to classify future stock
returns as positive or negative return.

In the data set, 310 firms were observed.

Relevant results from the Classification Tree module of R for the input data of the 6 attributes
considered appropriate are given below.

> rpartfit <- rpart(formula, data=datatowork)

> rpartfit
n =310

node), split, n, loss, yval, (yprob)

* denotes terminal node

1) root 310 94 positive return (0.3032258 0.6967742)
  2) ppershare=negative ppershare 75 35 positive return (0.4666667 0.5333333)
    4) mbv=>2 30 13 negative return (0.5666667 0.4333333)
      8) debt=high 12 3 negative return (0.7500000 0.2500000) *
      9) debt=low 18 8 positive return (0.4444444 0.5555556) *
    5) mbv=<=2 45 18 positive return (0.4000000 0.6000000) *
  3) ppershare=positive ppershare 235 59 positive return (0.2510638 0.7489362) *

(a) (7)

(b) Interpret the Classification Tree and define an appropriate decision rule for selecting a positive
return. (2)

(c) What percentage of firms is correctly identified as having a positive return by the chosen
criteria? Justify. Use this finding to comment on the reliability of the derived decision rule. (3)
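
One way to approach part (c) is to tally the terminal nodes of the printed tree, since each line gives the node size `n` and the number misclassified (`loss`). The Python sketch below does this under two assumptions: that the four starred nodes (8, 9, 5 and 3) are the complete set of leaves, and that "correctly identified as positive" may be read either as a share of all actual positives or as a share of all firms predicted positive. Both readings are computed; the names used are illustrative.

```python
# Terminal nodes of the rpart output: (node id, n, loss, predicted class).
leaves = [
    (8, 12, 3, "negative"),
    (9, 18, 8, "positive"),
    (5, 45, 18, "positive"),
    (3, 235, 59, "positive"),
]

n_total = 310
# Root node: predicted class is positive with loss 94, so 94 firms are negative.
actual_positive = n_total - 94

# Overall accuracy: firms that fall in a leaf whose prediction matches them.
misclassified = sum(loss for _, _, loss, _ in leaves)
accuracy = (n_total - misclassified) / n_total

# Firms predicted positive, and how many of those really are positive.
predicted_positive = sum(n for _, n, _, yval in leaves if yval == "positive")
true_positive = sum(n - loss for _, n, loss, yval in leaves if yval == "positive")

share_of_actual_positives = true_positive / actual_positive
share_of_predicted_positives = true_positive / predicted_positive

print(round(accuracy, 4))
print(round(share_of_actual_positives, 4), round(share_of_predicted_positives, 4))
```

The gap between the two shares shows why the denominator matters when commenting on the reliability of the decision rule.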

Question 8 [9 marks]

The model below was fitted to data on 32 insurance companies where

𝑌 = 𝑃𝐸 𝑟𝑎𝑡𝑖𝑜 (price-earnings ratio)
𝑋1 = 𝑠𝑖𝑧𝑒 𝑜𝑓 𝑖𝑛𝑠𝑢𝑟𝑎𝑛𝑐𝑒 𝑐𝑜𝑚𝑝𝑎𝑛𝑖𝑒𝑠, 𝑖𝑛 𝑏𝑖𝑙𝑙𝑖𝑜𝑛𝑠 𝑜𝑓 𝑅𝑎𝑛𝑑𝑠
𝑋2 = 𝐷𝑢𝑚𝑚𝑦 𝑣𝑎𝑟𝑖𝑎𝑏𝑙𝑒
𝑡𝑎𝑘𝑖𝑛𝑔 𝑡ℎ𝑒 𝑣𝑎𝑙𝑢𝑒 1 𝑓𝑜𝑟 𝑟𝑒𝑔𝑖𝑜𝑛𝑎𝑙 𝑐𝑜𝑚𝑝𝑎𝑛𝑖𝑒𝑠 𝑎𝑛𝑑 0 𝑓𝑜𝑟 𝑛𝑎𝑡𝑖𝑜𝑛𝑎𝑙 𝑐𝑜𝑚𝑝𝑎𝑛𝑖𝑒𝑠

(a) . (1)

(b) Assess the overall quality of the model at 5% significance level. (2)

(c) Which of the independent variables are significant? (2)

(d) Which type of analysis of variance method would be appropriate if you wish to include both of
the independent variables? What would be the difference between this method and regression
analysis? (1)

(e) State the coefficient of determination and correlation coefficient and explain the difference
between them. (2)

(f) Estimate the price-earnings ratio of a national insurance company with a size of 3 billion
Rands. (1)
> summary(lm(pe_ratio~size+regional,data=QuestionR))
lm(formula = pe_ratio ~ size + regional, data = QuestionR)
Coefficients:
Estimate Std.Error t value Pr(>|t|)
Intercept 7.62
size -0.16 0.008
regional 1.23 0.496
---

Residual standard error: 1.303 on 10 degrees of freedom

Multiple R-squared: 0.9157, Adjusted R-squared 0.8905
F-statistic: 36.22 on 3 and 10 DF, p-value: 1.109e-05

t_{29, 0.05/2} = 2.045

Question 9 [14 marks]

This question refers to a conceptual model that predicts reading (READ) and mathematics (MATH)
ability from observed scores of two intelligence scales, verbal comprehension (VC) and perceptual
organization (PO). The READ variable is indicated by basic word reading (BW) and reading
comprehension (RC) scores. The MATH variable is indicated by calculation (CL) and reasoning
(RE) scores. It is known that READ has an impact on the MATH variable. The following R output
gives the unstandardized estimates of the model.

(a) . (4)

(b) (2)

(c) Write down the set of …. equations for the model. Also indicate which of the equations are
from the measurement and structural part of the model and which part of the model is
significant at a 5% significance level. (5)

(d) Give a description of the measurement model part of the full SEM. That is, how are the latent
constructs being measured? (1)

(e) …. (2)

> summary(fit, fit.measures=TRUE)

Number of observations 200

χ2 = 8.63, p = 0.12, RMSEA = 0.057, SRMR = 0.017

Parameter estimates:
Estimate Std.err Z-value P(>|z|)
Latent variables:
READ =~
BW 1.000
RC 1.350 0.100
MATH =~
CL 1.000
RE 1.050 0.070
Regressions:
READ ~
VC 0.480 0.070
PO 0.040 0.050
MATH ~
VC 0.550 0.060
PO 0.160 0.050
READ 0.786 0.021

Variances:
BW 79.550 10.300

RC 5.280 11.960
CL 64.220 8.980
RE 36.770 7.880
READ 69.100 10.440
MATH 56.980 9.210
