Forecasting Report Lab 1
Question 1. Using the following formula 4-10, compute the Z value for that distribution when X =
JX - (the third digit of your student ID). For example, a student whose student ID is 2053128
will have X=123.
X = 579 – 5 = 574
$$Z = \frac{X - \mu}{\sigma} = \frac{574 - 578.051}{3.4} = -1.191$$
Question 2. Now, using your intuition, determine how many standard deviations (i.e., what
multiple of JY) X is from your mean. Using the procedure at formula 4-11 find the p-value of the
Z value you found.
- The z-value equals -1.191, which means X is 1.191 standard deviations below the mean.
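The p-value that goes with this Z value can be checked numerically. The sketch below is only an illustration using Python's standard library, not the procedure of formula 4-11 itself. Whether that formula asks for a one-sided or a two-sided p-value is not stated here, so both are computed. The variable names are chosen for this example.

```python
from statistics import NormalDist

# Values taken from the Z computation above.
mu, sigma = 578.051, 3.4
x = 574

z = (x - mu) / sigma

# Lower-tail probability P(Z <= z) under the standard normal distribution.
p_lower = NormalDist().cdf(z)

# Two-sided p-value, in case the intended test is two-tailed (an assumption).
p_two_sided = 2 * min(p_lower, 1 - p_lower)

print(f"z = {z:.3f}")
print(f"lower-tail p-value = {p_lower:.4f}")
print(f"two-sided p-value  = {p_two_sided:.4f}")
```

The lower-tail value is the probability of observing an X at least this far below the mean.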
NormX:
- Most of the points fall along the straight line.
=> A good fit to a normal distribution.
- The AD value is 0.242, which is relatively low.
=> The data closely follows a normal distribution.
- The P-value of 0.770 is relatively high.
=> H0 cannot be rejected (the data follows a normal distribution).
=> The NormX plot shows a good fit with minor deviations at the tails.
Norm2:
- The points also align well with the straight line but have slightly more deviation compared to NormX, especially at the lower percentiles where some points are farther from the line.
- The AD value is 0.411, which is higher than that of NormX.
=> A slightly worse fit to the normal distribution compared to NormX.
- The P-value of 0.341 is lower than that of NormX, but still high enough.
=> H0 of normality cannot be rejected.
=> While still fairly close to normal, Norm2 shows more deviation from the straight line, particularly at the extreme values.
Question 3. Does the plot match the plots you drew in the previous question?
NormX and Norm2:
- The A-D values and p-values of the two plots both give no evidence against the hypothesis that the data follows a normal distribution.
- Any deviations from the straight line in these plots are minor and occur mostly at the extreme ends (tails). The central portion of the data is closely aligned with the normal distribution.
=> These datasets are normally distributed with slight deviations at the tails.
C10:
- High A-D value = 9.539 and very low p-value < 0.005.
=> A poor fit to the normal distribution. This dataset presents significant deviation from normality.
- The deviations from the straight line in C10 are much more pronounced. The data shows a clear "S-curve" pattern, with significant deviations across the entire range, especially in the tails, confirming it does not follow a normal distribution.
=> The dataset is not normally distributed at all, as shown by the curvature in the plot and the poor AD statistic and p-value.
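As a rough cross-check of the reported p-values, the sketch below maps an Anderson-Darling statistic to an approximate p-value using the piecewise approximation usually attributed to D'Agostino and Stephens. That the software used in this lab applies exactly this approximation is an assumption. The function name `ad_pvalue` and the optional small-sample adjustment are illustrative choices.

```python
import math

def ad_pvalue(a2, n=None):
    """Approximate normality-test p-value for an Anderson-Darling statistic.

    a2 : the AD statistic
    n  : optional sample size for the small-sample adjustment
         (negligible for large n; skipped when n is None)
    """
    a = a2 if n is None else a2 * (1 + 0.75 / n + 2.25 / n**2)
    if a >= 0.6:
        p = math.exp(1.2937 - 5.709 * a + 0.0186 * a**2)
    elif a > 0.34:
        p = math.exp(0.9177 - 4.279 * a - 1.38 * a**2)
    elif a > 0.2:
        p = 1 - math.exp(-8.318 + 42.796 * a - 59.938 * a**2)
    else:
        p = 1 - math.exp(-13.436 + 101.14 * a - 223.73 * a**2)
    # The approximation is only meaningful as a probability, so clamp it.
    return min(max(p, 0.0), 1.0)

# AD values reported above for the three datasets.
for name, a2 in [("NormX", 0.242), ("Norm2", 0.411), ("C10", 9.539)]:
    print(name, f"AD = {a2}", f"approx. p = {ad_pvalue(a2):.3f}")
```

Under these assumptions the approximation gives values close to the reported 0.770 and 0.341 for NormX and Norm2, and a value far below 0.005 for C10.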
Question 4. Describe the distribution of “NormX”; in terms of N (the number of data values),
Mean, Median, StDev, Minimum, Maximum, Q1, and Q3 peaks, skewness and outliers.
Statistics
Variable  N     N*  Mean     SE Mean    StDev    Minimum  Q1       Median   Q3       Maximum
NormX     5000  0   578.881  0.0998369  7.05953  556.276  574.238  578.888  583.506  605.061
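The statistics in the table can be used for a few quick checks on shape and outliers. The sketch below reads the table as N = 5000, Mean = 578.881, StDev = 7.05953, Q1 = 574.238, Median = 578.888, Q3 = 583.506, Minimum = 556.276 and Maximum = 605.061. It uses the common 1.5 x IQR boxplot rule for outliers, which is a convention assumed here and not something the lab prescribes.

```python
import math

# Values read from the Statistics table for NormX.
n, mean, stdev = 5000, 578.881, 7.05953
minimum, q1, median, q3, maximum = 556.276, 574.238, 578.888, 583.506, 605.061

# Standard error of the mean should equal StDev / sqrt(N).
se_mean = stdev / math.sqrt(n)

# A mean close to the median is a rough sign of a symmetric distribution.
mean_minus_median = mean - median

# 1.5 * IQR fences, the usual boxplot convention for flagging outliers.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"SE Mean = {se_mean:.7f}")
print(f"mean - median = {mean_minus_median:.3f}")
print(f"IQR = {iqr:.3f}, fences = [{lower_fence:.3f}, {upper_fence:.3f}]")
print("minimum below lower fence:", minimum < lower_fence)
print("maximum above upper fence:", maximum > upper_fence)
```

The computed standard error agrees with the SE Mean column, and the mean and median differ by less than 0.01, which is consistent with little or no skewness. Both the minimum and the maximum fall outside the fences, so a boxplot would flag at least one point on each side. With 5000 observations, a few such points are expected even for normally distributed data.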
Question 5. Does the data exhibit specific patterns or significant spikes at various lags? What is
your conclusion regarding this series?
The ACF shows small spikes at a few lags:
+ Lag 14 (ACF = 0.038): A small positive autocorrelation.
+ Lag 29 (ACF = 0.029): A mild spike in autocorrelation.
+ Lag 52 (ACF = 0.027): Another mild spike.
+ Lag 105 (ACF = 0.034): A slightly higher correlation.
=> These spikes might suggest some periodic or cyclical pattern in the data, though they
are very weak (all below 0.04 in absolute value).
Conclusion:
+ The time series has weak autocorrelation because the overall ACF values across most
lags are low. The data points are largely independent from each other, with limited
influence from prior values.
=> The time series is mostly random or weakly autocorrelated, with occasional weak
periodicities.
+ The slight spikes at lags like 14, 29, 52, and 105 could suggest weak periodicity. The
series does not show strong long-term trends or patterns based on the provided ACF
values.
=> This does not suggest strong dependencies between time points.
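One way to judge whether spikes of this size matter is to compare them with the approximate 95% white-noise bounds of plus or minus 1.96 / sqrt(n). The series length is not stated in this section, so `n = 5000` below is an assumption borrowed from the NormX sample size. The function name is also illustrative.

```python
import math

def white_noise_bound(n, z=1.96):
    """Approximate 95% bound for the sample ACF of a white-noise series."""
    return z / math.sqrt(n)

# ASSUMPTION: the series length is not given in the report; 5000 is a placeholder.
n = 5000

# ACF values reported above, keyed by lag.
reported_acf = {14: 0.038, 29: 0.029, 52: 0.027, 105: 0.034}

bound = white_noise_bound(n)
exceeds = {lag: abs(r) > bound for lag, r in reported_acf.items()}

print(f"bound = +/-{bound:.4f}")
for lag, r in reported_acf.items():
    status = "outside" if exceeds[lag] else "inside"
    print(f"lag {lag}: ACF = {r} is {status} the bounds")
```

Even when a few lags fall slightly outside the bounds, roughly 5% of lags are expected to do so by chance for a purely random series, so isolated small spikes across more than a hundred lags are weak evidence of structure.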
Question 6. In the Ljung-Box Q test table, are there any significant p-values suggesting the data
is not random? Describe a test for the hypothesis that the autocorrelations up to the highest lag
are equal to zero.
There are no significant p-values (all are greater than 0.05), so there is no evidence to
reject the null hypothesis, which suggests the data is random.
Test:
+ Hypotheses:
Null Hypothesis (H0): The autocorrelations of the time series are equal to zero up
to the highest lag.
Alternative Hypothesis (H1): At least one autocorrelation is not equal to zero.
+ Test Method
Calculate the Ljung-Box Q Statistic:
$$Q = n(n+2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^{2}}{n-k}$$
Where:
n : the sample size
$\hat{\rho}_k$ : the sample autocorrelation at lag k
h : the number of lags being tested
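Decision Rule:
Under H0, Q approximately follows a chi-square distribution with h degrees of freedom, and the p-value is the upper-tail probability of the observed Q.
If the p-value < 0.05: Reject H0, the series is not random.
If the p-value ≥ 0.05: Fail to reject H0, the series behaves randomly.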
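The test described above can be sketched in a few lines of standard-library Python. This is a minimal illustration and not the routine used to produce the table in the lab. The helper names (`sample_acf`, `chi2_sf`, `ljung_box`) and the simulated series are assumptions made for the example. The chi-square tail probability is computed with a recurrence that is valid for integer degrees of freedom.

```python
import math
import random

def sample_acf(x, max_lag):
    """Sample autocorrelations at lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]

def chi2_sf(x, df):
    """Upper-tail probability P(chi2_df > x) for integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # df = 1
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while 2 * a < df:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)

def ljung_box(x, h):
    """Return (Q, p-value) for H0: autocorrelations at lags 1..h are all zero."""
    n = len(x)
    rho = sample_acf(x, h)
    q = n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(rho, start=1))
    return q, chi2_sf(q, h)

# Simulated white noise (hypothetical data, not the lab series).
random.seed(0)
noise = [random.gauss(0, 1) for _ in range(500)]
q_noise, p_noise = ljung_box(noise, h=10)
print(f"white noise: Q = {q_noise:.3f}, p-value = {p_noise:.3f}")

# A strongly trending series should be rejected as non-random.
q_trend, p_trend = ljung_box(list(range(100)), h=5)
print(f"trend: Q = {q_trend:.3f}, p-value = {p_trend:.3g}")
```

Applying the decision rule, a p-value below 0.05 rejects H0; the trending series is included only to show a case where the test does reject.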