Forecasting Report Lab 1

1. Minitab and Describing Data

Question 1. Using formula 4-10, compute the Z value for that distribution when X = JX - (the third digit of your student ID). For example, a student with the student ID 2053128 will have X = 123.
X = 579 – 5 = 574
Z = (X − μ)/σ = (574 − 578.051)/3.4 = −1.191
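
For reference, the same Formula 4-10 calculation can be reproduced in a few lines of Python (a minimal sketch, not part of the Minitab procedure; μ = 578.051 and σ = 3.4 are the values used above):

# Z-score for X using the distribution parameters from the calculation above.
mu = 578.051      # mean used in Formula 4-10 above
sigma = 3.4       # standard deviation used above
x = 579 - 5       # X = 579 minus the third digit of the student ID

z = (x - mu) / sigma
print(f"Z = {z:.3f}")   # prints Z = -1.191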
Question 2. Now, using your intuition, determine how many standard deviations (i.e., what multiple of JY) X is from your mean. Using the procedure in formula 4-11, find the p-value of the Z value you found.
- The z-value equals -1.191, which means X is 1.191 standard deviations below the mean.

P(X ≤ 574) = P((X − μ)/σ ≤ (574 − 578.051)/3.4) = P(Z ≤ −1.191) = 0.11683
- p-value = 0.11683.
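
The table lookup in Formula 4-11 can also be checked programmatically; a sketch assuming SciPy is available (the lab itself reads the value from the standard normal table / Minitab):

# Cumulative probability of the standard normal at z = -1.191.
from scipy.stats import norm

z = -1.191
p_value = norm.cdf(z)                # P(Z <= -1.191)
print(f"p-value = {p_value:.5f}")    # approximately 0.1168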

2. Working with Random Variables


Question 1. Describe the distribution of NormX. Does this variable look like it was drawn from
a normal distribution? Why is that?
*The distribution of NormX:
- The mean of this distribution is 578.9
- The standard deviation is 7.060
- The number of data points is 5000
* This variable looks like it was drawn from a normal distribution because:
- The histogram is symmetric around the mean.
- The data points are distributed around the mean (578.9) with frequencies decreasing as values
move farther from the center, creating a smooth bell-shaped curve.
- The distribution follows the empirical rule, where about 68% of the data falls within one
standard deviation from the mean, 95% within two standard deviations, and so on.
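
As an illustration of the empirical-rule check, a variable like NormX can be simulated and inspected in Python (a sketch only; 578.9 and 7.06 are the summary values reported above, and the simulated column merely stands in for the Minitab-generated data):

# Simulate a NormX-like column and check the 68-95 empirical rule.
import numpy as np

rng = np.random.default_rng(seed=1)
norm_x = rng.normal(loc=578.9, scale=7.06, size=5000)

mean, sd = norm_x.mean(), norm_x.std()
within_1sd = np.mean(np.abs(norm_x - mean) <= 1 * sd)   # expect about 0.68
within_2sd = np.mean(np.abs(norm_x - mean) <= 2 * sd)   # expect about 0.95
print(f"within 1 SD: {within_1sd:.3f}, within 2 SD: {within_2sd:.3f}")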
Question 2. How does the probability plot for NormX compare to the plot for Norm2? How well
do the observations fall along a straight line in each of the two plots?

NormX:
- Most of the points fall along the straight line => a good fit to a normal distribution.
- The AD value is 0.242, which is relatively low => the data closely follows a normal distribution.
- The p-value of 0.770 is relatively high => H0 cannot be rejected (the data follows a normal distribution).
=> The NormX plot shows a good fit with minor deviations at the tails.

Norm2:
- The points also align well with the straight line but show slightly more deviation compared to NormX, especially at the lower percentiles, where some points are farther from the line.
- The AD value is 0.411, which is higher than that of NormX => a slightly worse fit to the normal distribution compared to NormX.
- The p-value of 0.341 is lower than that of NormX, but still high enough => H0 of normality cannot be rejected.
=> While still fairly close to normal, Norm2 shows more deviation from the straight line, particularly at the extreme values.
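
The probability plots and AD statistics above come from Minitab; a rough equivalent can be sketched in Python. The two simulated columns below are only stand-ins for NormX and Norm2 (the real Norm2 parameters are not shown here), and SciPy reports the AD statistic with critical values rather than the p-value Minitab prints:

# Normal probability plots plus Anderson-Darling statistics for two columns.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(seed=2)
columns = {
    "NormX": rng.normal(578.9, 7.06, 5000),   # stand-in for the NormX column
    "Norm2": rng.normal(578.9, 7.06, 200),    # stand-in for the Norm2 column
}

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, (name, data) in zip(axes, columns.items()):
    stats.probplot(data, dist="norm", plot=ax)            # points vs. reference line
    ad_stat = stats.anderson(data, dist="norm").statistic
    ax.set_title(f"{name}: AD = {ad_stat:.3f}")
plt.tight_layout()
plt.show()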
Question 3. Does the plot match the plots you drew in the previous question?

NormX, Norm2:
- The data points of both plots closely follow a straight line, so these datasets are approximately normally distributed. Although there are minor deviations, especially at the tails, the data fits well along the line.
- The A-D values and p-values of both plots provide enough evidence to conclude that the data follows a normal distribution.
- Any deviations from the straight line in these plots are minor and occur mostly at the extreme ends (tails). The central portion of the data is closely aligned with the normal distribution.
=> These datasets are normally distributed with slight deviations at the tails.

C10:
- The plot for C10 deviates significantly from the straight line. There is a noticeable curvature in the data points, especially at the extremes => C10 does not follow a normal distribution.
- The A-D value is high (9.539) and the p-value is very low (< 0.005) => a poor fit to the normal distribution; this dataset shows significant deviation from normality.
- The deviations from the straight line in C10 are much more pronounced. The data shows a clear "S-curve" pattern, with significant deviations across the entire range, especially in the tails, confirming it does not follow a normal distribution.
=> The dataset is not normally distributed at all, as shown by the curvature in the plot and the poor AD statistic and p-value.

Question 4. Describe the distribution of “NormX” in terms of N (the number of data values), Mean, Median, StDev, Minimum, Maximum, Q1, and Q3, as well as peaks, skewness and outliers.

Statistics

Variable      N   N*     Mean    SE Mean    StDev  Minimum       Q1   Median       Q3  Maximum
NormX      5000    0  578.881  0.0998369  7.05953  556.276  574.238  578.888  583.506  605.061

 N = 5000: The dataset consists of 5,000 data points.


 Mean = 578.881: The average value of all data points in the dataset is 578.881.
 Median = 578.888: This is the middle value; 50% of the data lies below 578.888.
 StDev = 7.05953: The data points typically deviate from the mean by 7.05953 units. It
means most of the data points are within 7.05953 units above or below the mean.
 Minimum = 556.276: This is the smallest value in the dataset.
 Maximum = 605.061: This is the largest value in the dataset.
=> The range (difference between maximum and minimum) is 605.061 - 556.276 =
48.785, indicating the total spread of the data.
 Q1 = 574.238: 25% of the data falls below 574.238.
 Q3 = 583.506: 75% of the data falls below 583.506.
 Skewness: The data has very small skewness because the mean (578.881) is close to the median (578.888), so the distribution is approximately symmetric.
 Outliers: Any values significantly lower than Q1 - 1.5 * IQR or higher than Q3 + 1.5 *
IQR might be considered outliers.
+ IQR = Q3 - Q1 = 583.506 - 574.238 = 9.268
+ Lower Bound = Q1 - 1.5 * IQR = 574.238 - 1.5 * 9.268 = 560.336
+ Upper Bound = Q3 + 1.5 * IQR = 583.506 + 1.5 * 9.268 = 597.408
=> Any values below 560.336 or above 597.408 might be potential outliers.
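
The descriptive statistics and the 1.5 × IQR outlier bounds above can be reproduced with NumPy; in this sketch norm_x is simulated as a stand-in for the exported NormX column:

# Descriptive statistics and 1.5*IQR outlier bounds for a NormX-like column.
import numpy as np

rng = np.random.default_rng(seed=3)
norm_x = rng.normal(578.9, 7.06, 5000)        # stand-in for NormX

q1, median, q3 = np.percentile(norm_x, [25, 50, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = norm_x[(norm_x < lower) | (norm_x > upper)]

print(f"N = {norm_x.size}, mean = {norm_x.mean():.3f}, stdev = {norm_x.std(ddof=1):.5f}")
print(f"min = {norm_x.min():.3f}, Q1 = {q1:.3f}, median = {median:.3f}, "
      f"Q3 = {q3:.3f}, max = {norm_x.max():.3f}")
print(f"IQR = {iqr:.3f}, bounds = ({lower:.3f}, {upper:.3f}), "
      f"potential outliers: {outliers.size}")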
Autocorrelations

Lag        ACF      T     LBQ
  1  0.0164153   1.16    1.35
  2  0.0103653   0.73    1.89
  3  0.0040449   0.29    1.97
  4  0.0093560   0.66    2.41
  5  0.0071379   0.50    2.66
  6  0.0096587   0.68    3.13
  7  0.0116291   0.82    3.81
  8  0.0046483   0.33    3.91
  9  0.0141629   1.00    4.92
 10  0.0006942   0.05    4.92
 11  0.0125252   0.88    5.71
 12  0.0140290   0.99    6.69
 13  0.0014205   0.10    6.70
 14  0.0384804   2.72   14.13
 15  0.0088918   0.63   14.53
 16  0.0109835   0.77   15.13
 17  0.0137087   0.97   16.08
 18  0.0145252   1.02   17.14
 19  0.0190768   1.34   18.96
 20  0.0011721   0.08   18.97
 21  0.0047636   0.34   19.08
 22  0.0045745   0.32   19.19
 23  0.0025029   0.18   19.22
 24  0.0160656   1.13   20.52
 25  0.0031793   0.22   20.57
 26  0.0046755   0.33   20.68
 27  0.0011026   0.08   20.69
 28  0.0063377   0.45   20.89
 29  0.0291117   2.05   25.15
 30  0.0073677   0.52   25.42
 31  0.0006502   0.05   25.43
 32  0.0079329   0.56   25.74
 33  0.0118113   0.83   26.45
 34  0.0065244   0.46   26.66
 35  0.0206596   1.45   28.81
 36  0.0008274   0.06   28.81
 37  0.0080612   0.57   29.14
 38  0.0046788   0.33   29.25
 39  0.0124212   0.87   30.03
 40  0.0072107   0.51   30.29
 41  0.0204413   1.44   32.40
 42  0.0029886   0.21   32.44
 43  0.0019690   0.14   32.46
 44  0.0066435   0.47   32.69
 45  0.0074647   0.52   32.97
 46  0.0070541   0.50   33.22
 47  0.0036716   0.26   33.29
 48  0.0012560   0.09   33.29
 49  0.0063629   0.45   33.50
 50  0.0126912   0.89   34.31
 51  0.0227522   1.60   36.93
 52  0.0269825   1.89   40.61
 53  0.0159398   1.12   41.89
 54  0.0032239   0.23   41.95
 55  0.0128577   0.90   42.78
 56  0.0083448   0.59   43.13
 57  0.0157821   1.11   44.39
 58  0.0218597   1.53   46.81
 59  0.0078228   0.55   47.12
 60  0.0100713   0.71   47.64
 61  0.0009156   0.06   47.64
 62  0.0229640   1.61   50.31
 63  0.0147970   1.04   51.42
 64  0.0219663   1.54   53.87
 65  0.0005374   0.04   53.87
 66  0.0274930   1.92   57.70
 67  0.0226994   1.59   60.31
 68  0.0152153   1.06   61.48
 69  0.0176456   1.23   63.06
 70  0.0190871   1.33   64.91
 71  0.0000892   0.01   64.91
 72  0.0128711   0.90   65.75
 73  0.0165103   1.15   67.14
 74  0.0047259   0.33   67.25
 75  0.0091666   0.64   67.68
 76  0.0035627   0.25   67.74
 77  0.0062643   0.44   67.94
 78  0.0174876   1.22   69.49
 79  0.0125423   0.87   70.29
 80  0.0169263   1.18   71.75
 81  0.0011772   0.08   71.76
 82  0.0010667   0.07   71.76
 83  0.0264746   1.85   75.33
 84  0.0155685   1.08   76.56
 85  0.0194672   1.36   78.49
 86  0.0068075   0.47   78.73
 87  0.0038653   0.27   78.80
 88  0.0072927   0.51   79.07
 89  0.0036527   0.25   79.14
 90  0.0019503   0.14   79.16
 91  0.0049237   0.34   79.28
 92  0.0032906   0.23   79.34
 93  0.0054661   0.38   79.49
 94  0.0108888   0.76   80.10
 95  0.0152936   1.06   81.29
 96  0.0021864   0.15   81.31
 97  0.0154670   1.08   82.53
 98  0.0074238   0.52   82.81
 99  0.0216531   1.51   85.21
100  0.0143673   1.00   86.26
101  0.0002566   0.02   86.26
102  0.0087645   0.61   86.65
103  0.0060436   0.42   86.84
104  0.0156072   1.09   88.08
105  0.0336897   2.34   93.88
106  0.0133004   0.92   94.79
107  0.0036267   0.25   94.85
108  0.0001720   0.01   94.85
109  0.0087232   0.61   95.24
110  0.0111553   0.77   95.88
111  0.0066106   0.46   96.10
112  0.0096993   0.67   96.58
113  0.0215701   1.50   98.97
114  0.0063592   0.44   99.17
115  0.0221949   1.54  101.69

Question 5. Does the data exhibit specific patterns or significant spikes at various lags? What is
your conclusion regarding this series?
 The data exhibit specific patterns or significant spikes at various lags:
+ Lag 14 (ACF = 0.038): A moderate positive autocorrelation.
+ Lag 29 (ACF = 0.029): A mild spike in autocorrelation.
+ Lag 52 (ACF = 0.027): Another moderate spike.
+ Lag 105 (ACF = 0.034): A slightly higher correlation.
=> These spikes might suggest some periodic or cyclical pattern in the data, though they
are not very strong.
 Conclusion:
+ The time series has weak autocorrelation because the overall ACF values across most
lags are low. The data points are largely independent from each other, with limited
influence from prior values.
=> The time series is mostly random or weakly autocorrelated, with occasional weak
periodicities.
+ The slight spikes at lags such as 14, 29, 52, and 105 could suggest weak periodicity, but the series does not show strong long-term trends or patterns based on the provided ACF values.
=> They do not suggest strong dependencies between time points.
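
For comparison, the same kind of ACF check can be run outside Minitab with statsmodels (a sketch; norm_x is again a simulated stand-in for the stored series):

# ACF plot with 5% significance bands; spikes outside the bands (such as lag 14
# in the Minitab output above) are the ones worth a closer look.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf

rng = np.random.default_rng(seed=4)
norm_x = rng.normal(578.9, 7.06, 5000)

plot_acf(norm_x, lags=115, alpha=0.05)
plt.show()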
Question 6. In the Ljung-Box Q test table, are there any significant p-values suggesting the data
is not random? Describe a test for the hypothesis that the autocorrelations up to the highest lag
are equal to zero.
 There are no significant p-values (all are greater than 0.05), so there is no evidence to reject the null hypothesis, suggesting the data is random.
 Test:
+ Hypotheses:
 Null Hypothesis (H0): The autocorrelations of the time series are equal to zero up
to the highest lag.
 Alternative Hypothesis (H1): At least one autocorrelation is not equal to zero.
+ Test Method
 Calculate the Ljung-Box Q Statistic:
Q = n(n + 2) ∑_{k=1}^{h} p̂_k² / (n − k)

Where:
n : the sample size
p̂_k : the sample autocorrelation at lag k
h : the number of lags being tested

 Determine the p-value: The calculated Q statistic is then compared to a chi-squared distribution with h degrees of freedom to obtain the p-value.
 Decision Rule:
If the p-value < 0.05: Reject H0, the series is not random.
If the p-value ≥ 0.05: Fail to reject H0, the series behaves randomly.
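
The same test is available in statsmodels; the sketch below uses a simulated stand-in for the series and checks the lags highlighted earlier:

# Ljung-Box Q test at selected lags; p-values >= 0.05 mean we fail to reject H0.
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(seed=5)
norm_x = rng.normal(578.9, 7.06, 5000)

result = acorr_ljungbox(norm_x, lags=[14, 29, 52, 105])
print(result)   # DataFrame with lb_stat (Q) and lb_pvalue per lag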
