Random Numbers

Uploaded by

xojeje7914

Unit 5 Random Number Generations (7 Hours)
Contents of the unit
• Random Numbers and their properties, Pseudo-Random Numbers,
Methods of generation of Random Numbers, Tests for Randomness
(uniformity and independence), Random Variate Generation

2
Random Numbers
• Random numbers are numbers that occur in a sequence such that
two conditions are met: (1) the values are uniformly distributed over
a defined interval or set, and (2) it is impossible to predict future
values based on past or present ones.
• Random numbers are important in statistical analysis and probability
theory.

3
Random Numbers
• Properties of Random Numbers
• A sequence of random numbers R1, R2, R3, ..., Rn must have two
important properties:
• Uniformity: every value in the interval is equally probable.
• Independence: the current value of a random variable has no relation
to the previous values.

4
Random Numbers
• Properties of Random Numbers
• Some consequences of the uniformity and independence properties
• If the interval (0,1) is divided into n sub-intervals of equal length, the
expected number of observations in each interval is N/n where N is
the total number of observations. Note that N has to be sufficiently
large to show this trend.
• The probability of observing a value in a particular interval is
independent of the previous values drawn.

5
Random Numbers
• Random numbers are a necessary basic ingredient in the simulation of
almost all discrete systems.

• Most computer languages have a subroutine, object or function that
will generate a random number.

• Similarly, simulation languages generate random numbers that are
used to generate event times and other random variables.

6
Random Number Tables
• A random number table is a table of numbers generated in an
unpredictable, haphazard way that are uniformly distributed within a
certain interval.

• The numbers in a random number table exactly obey the two properties
of random numbers, uniformity and independence, so numbers drawn from
such a table are also called true random numbers.
• Tables of random numbers are used to create a random sample.

7
Random Number Tables
• A random number table is also called a random sample table.
• There are many physical devices or processes that can be used to
generate a sequence of uniformly distributed random numbers.

8
Pseudo Random Numbers
• Pseudo means false, so false random numbers are being generated.
• The goal of any generation scheme is to produce a sequence of
numbers between 0 and 1 which simulates or imitates, the ideal
properties of uniform distribution and independence as closely as
possible.
• When generating pseudo-random numbers, certain problems or
errors can occur.
• Some examples of such errors include the following:

9
Pseudo Random Numbers
1. The generated numbers may not be uniformly distributed.
2. The generated numbers may be discrete-valued instead of
continuous-valued.
3. The mean of the generated numbers may be too high or too low.
4. The variance of the generated numbers may be too high or too low.
5. There may be dependence between the generated numbers.

10
Properties of Good random Numbers
Generators
• Usually, random numbers are generated by a digital computer as part of
the simulation.
• Numerous methods can be used to generate the random numbers.
• In selecting among these methods, there are some important
considerations.
1. The routine should be fast: The total cost can be managed by selecting
a computationally efficient method of random-number generation.
2. The routine should be portable to different computers, and ideally to
different programming languages. This is desirable so that the
simulation program produces the same result wherever it is executed.
Properties of Good random Numbers
Generators
• The routine should have a sufficiently long cycle. The cycle length, or period,
represents the length of the random number sequence before previous numbers
begin to repeat themselves in an earlier order.
• The random numbers should be replicable. Given the starting point, it should be
possible to generate the same set of random numbers, completely independent
of the system that is being simulated.
• This is helpful for debugging purposes and is a means of facilitating comparisons
between systems.
• Most important, and as indicated previously, the generated random numbers
should closely approximate the ideal statistical properties of uniformity and
independence.
Methods to generate Random Numbers
• Linear Congruential Method: The linear congruential method, initially
proposed by Lehmer [1951], produces a sequence of integers $X_1, X_2, \ldots$
between zero and $m-1$ according to the following recursive relationship:
• $X_{i+1} = (a X_i + c) \bmod m, \quad i = 0, 1, 2, \ldots$ (1)
• The initial value $X_0$ is called the seed, $a$ is called the constant
multiplier, $c$ is the increment, and $m$ is the modulus.
• Case 1: If $c \neq 0$, the form is called the mixed congruential method.
• Case 2: If $c = 0$, the form is known as the multiplicative congruential
method. The selection of the values for $a$, $c$, $m$ and $X_0$ drastically affects
the statistical properties and the cycle length.
Methods to generate Random Numbers
• Here random integers are being generated rather than random
numbers.
• These random integers should appear to be uniformly distributed on
the integers zero to $m-1$.
• Random numbers $R_i$ between 0 and 1 can be generated by setting
• $R_i = X_i / m, \quad i = 1, 2, \ldots$
• Note: The numbers generated from equation (1) assume values only
from the set $I = \{0, 1/m, 2/m, \ldots, (m-1)/m\}$. Thus each $R_i$ is discrete on
$I$, instead of continuous on the interval [0,1].
Methods to generate Random Numbers
• This approximation appears to be of little consequence if the modulus $m$
is a very large integer. [Values such as $m = 2^{31} - 1$ and $m = 2^{48}$ are in
common use in generators appearing in many simulation languages.]
• By maximum density is meant that the values assumed by $R_i$, $i = 1, 2, \ldots$,
leave no large gaps on [0,1].
• Second, to help achieve maximum density, and to avoid cycling in
practical applications, the generator should have the largest possible
period.
• Maximum period can be achieved by the proper choice of $a$, $c$, $m$ and $X_0$.
15
Methods to generate Random Numbers
1. For $m$ a power of 2, say $m = 2^b$, and $c \neq 0$, the longest possible period is
$P = m = 2^b$, which is achieved whenever $c$ is relatively prime to $m$ and
$a = 1 + 4k$, where $k$ is an integer.
2. For $m$ a power of 2, say $m = 2^b$, and $c = 0$, the longest possible
period is $P = m/4 = 2^{b-2}$, which is achieved if the seed $X_0$ is odd and
if the multiplier $a$ is given by $a = 3 + 8k$ or $a = 5 + 8k$ for some
$k = 0, 1, 2, \ldots$
3. For $m$ a prime number and $c = 0$, the longest possible period is
$P = m - 1$, which is achieved whenever the multiplier $a$ has the property
that the smallest integer $k$ such that $a^k - 1$ is divisible by $m$ is $k = m - 1$.
Methods to generate Random Numbers
• Speed and efficiency in using the generator on a digital computer are
also a selection consideration.
• Speed and efficiency are aided by use of a modulus $m$ which is either
a power of 2 or close to a power of 2.
• Since most digital computers use a binary representation of numbers,
the modulo, or remaindering, operation of equation (1) can be
conducted efficiently when the modulus is a power of 2.

17
Example 1
• Use the linear congruential method to generate a sequence of
random numbers with $X_0 = 27$, $a = 17$, $c = 43$ and $m = 100$.
• Solution:
• Here, the integer values generated will all be between zero and 99
because of the value of the modulus.
• $X_{i+1} = (a X_i + c) \bmod m$
• $X_0 = 27$
• $X_1 = (17 \cdot 27 + 43) \bmod 100 = 502 \bmod 100 = 2$
• $R_1 = 2/100 = 0.02$
• $X_2 = (17 \cdot 2 + 43) \bmod 100 = 77 \bmod 100 = 77$
• $R_2 = 77/100 = 0.77$
• $X_3 = (17 \cdot 77 + 43) \bmod 100 = 1352 \bmod 100 = 52$
• $R_3 = 52/100 = 0.52$
• $X_4 = (17 \cdot 52 + 43) \bmod 100 = 927 \bmod 100 = 27$
• $R_4 = 27/100 = 0.27$
• $X_5 = (17 \cdot 27 + 43) \bmod 100 = 2$, so the sequence repeats with period 4.
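The recurrence of equation (1) can be sketched in a few lines of Python. This is only a minimal illustration; the function name `lcg` is an assumption made here, not something defined in the unit, and the parameters are those of Example 1.

```python
from itertools import islice

def lcg(seed, a, c, m):
    """Yield X1, X2, ... from X_{i+1} = (a * X_i + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

# Parameters of Example 1: X0 = 27, a = 17, c = 43, m = 100.
xs = list(islice(lcg(27, 17, 43, 100), 5))
rs = [x / 100 for x in xs]   # R_i = X_i / m
print(xs)  # [2, 77, 52, 27, 2]
print(rs)  # [0.02, 0.77, 0.52, 0.27, 0.02]
```

The fifth value equals the first, which shows that this particular choice of parameters cycles after only four numbers.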

19
Example 2
• Using the multiplicative congruential method, find the period of the
generator for $a = 13$, $m = 2^6 = 64$ and $X_0 = 1, 2, 3$ and 4. When the seed
is 1 or 3, the sequence has period 16. However, a period of length 8 is
achieved when the seed is 2 and a period of length 4 occurs when the
seed is 4.
• Solution:
• Here $m = 2^6 = 64$ and $c = 0$. The maximum period is therefore
$P = m/4 = 16$.
• This period is achieved by using odd seeds, $X_0 = 1$ and $X_0 = 3$.
• Note that $a = 13$ is of the form $5 + 8k$ with $k = 1$, as is required to
achieve maximum period.
Example 2
• When $X_0 = 1$, the generated sequence assumes values from the set
{1, 5, 9, 13, ..., 53, 57, 61}.
• The gaps in the sequence of generated random numbers $R_i$ are quite
large (5/64 - 1/64 = 0.0625); such a gap gives rise to concern about the
density of the generated sequence.
• The generator in this example is not viable for any application: its period
is too short and its density is insufficient.
• However, the example shows the importance of properly choosing $a$,
$c$, $m$ and $X_0$.
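The periods quoted in Example 2 can be checked by brute force. The sketch below is illustrative only; the helper name `lcg_period` is an assumption, and the approach (remembering the step at which each state was first seen) is just one simple way to measure a cycle length.

```python
def lcg_period(seed, a, c, m):
    """Length of the cycle that the LCG eventually enters from `seed`."""
    seen = {}            # state -> step at which it first appeared
    x, step = seed, 0
    while x not in seen:
        seen[x] = step
        x = (a * x + c) % m
        step += 1
    return step - seen[x]

# Example 2: a = 13, c = 0, m = 2**6 = 64, seeds 1 to 4.
periods = {seed: lcg_period(seed, 13, 0, 64) for seed in (1, 2, 3, 4)}
print(periods)  # {1: 16, 2: 8, 3: 16, 4: 4}
```

Only the odd seeds reach the maximum period m/4 = 16, in line with the rule for m a power of 2 and c = 0.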

21
Example 3
• Let $X_0 = 63$, $a = 19$, $c = 0$ and $m = 10^2 = 100$, and generate a sequence of
random integers using equation (1).
• Solution:
• $X_0 = 63$
• $X_1 = (19 \cdot 63) \bmod 100 = 1197 \bmod 100 = 97$
• $X_2 = (19 \cdot 97) \bmod 100 = 1843 \bmod 100 = 43$
• $X_3 = (19 \cdot 43) \bmod 100 = 817 \bmod 100 = 17$
• When $m$ is a power of 10, say $m = 10^b$, the modulo operation is
accomplished by saving the $b$ rightmost digits.
• By analogy, the modulo operation is most efficient for binary computers
when $m = 2^b$ for some $b > 0$.
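The binary analogy can be made concrete: for a non-negative integer, the remainder modulo a power of 2 is just its lowest bits, which a single bitwise AND extracts. The snippet below is a small sketch with arbitrarily chosen test values; b = 6 is an assumption picked to match the modulus 64 used earlier.

```python
b = 6
m = 1 << b      # m = 2**6 = 64
mask = m - 1    # binary 111111, keeps the 6 lowest bits

for x in (13, 169, 2197, 123457):
    # x mod 2**b equals the b lowest bits of x (for non-negative x).
    assert x % m == x & mask

print(2197 % m, 2197 & mask)   # 21 21
print(1197 % 100)              # 97, the two rightmost decimal digits of 1197
```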
22
Example 4
• Let $a = 7^5 = 16{,}807$, $c = 0$, and $m = 2^{31} - 1 = 2{,}147{,}483{,}647$ (a prime
number). These choices satisfy the conditions that ensure a period of
$P = m - 1$ (over 2 billion). Further, specify the seed $X_0 = 123{,}457$. The
first number generated is as follows:
• $X_1 = (16{,}807 \cdot 123{,}457) \bmod 2{,}147{,}483{,}647 = 2{,}074{,}941{,}799$
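The first step of Example 4 is easy to evaluate directly, since Python integers do not overflow. The variable names below are chosen here for illustration; only the values of a, m and the seed come from the example.

```python
a = 7 ** 5          # 16,807
m = 2 ** 31 - 1     # 2,147,483,647
x0 = 123_457        # seed from Example 4

x1 = (a * x0) % m   # multiplicative congruential step (c = 0)
r1 = x1 / m
print(x1)            # 2074941799
print(round(r1, 4))  # 0.9662
```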

23
Random number generation using mid square
method
• This method was proposed by von Neumann. We start with a seed, square
it, and take the middle digits of the square as the random number, which
also becomes the next seed.
• Suppose the seed has N digits. Squaring it gives a number of up to 2N
digits; if the square has fewer than 2N digits, zeros are added in front of
the number to make it 2N digits, and the middle N digits are then extracted.
• A good algorithm is one that does not depend on the seed and whose
period is as long as possible, so that it touches almost every number in its
range before it starts repeating itself. As a rule of thumb, the longer the
period, the more random the numbers appear.
• The main problem of this method is that 0 can be generated in the
sequence; once that happens, every later value is also 0.
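A minimal sketch of the mid-square method for a four-digit seed follows. The function name `mid_square` and the seeds 5497 and 2500 are assumptions chosen for illustration; they do not come from the slides.

```python
def mid_square(seed, n_digits, count):
    """Return `count` values of the mid-square method for an n-digit seed."""
    values = []
    x = seed
    for _ in range(count):
        squared = str(x * x).zfill(2 * n_digits)   # left-pad with zeros to 2N digits
        start = n_digits // 2
        x = int(squared[start:start + n_digits])   # keep the middle N digits
        values.append(x)
    return values

print(mid_square(5497, 4, 3))  # [2170, 7089, 2539]
print(mid_square(2500, 4, 3))  # [2500, 2500, 2500]
```

The second call shows a degenerate seed: 2500 squared is 06250000, whose middle digits are again 2500, so the sequence never moves. Getting stuck at 0 is the same kind of failure.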

25
Testing for Randomness
• The desirable properties of random numbers are uniformity and
independence. To check that these properties are achieved, a number of
tests can be performed.
• The tests can be placed into two categories according to the
properties of interest.
• Testing for uniformity.
• Testing for independence.

26
Testing for Randomness
• The desired properties of random numbers are uniformity and
independence.
• So, testing random numbers means testing for uniformity and
independence.
• Different types of tests are used for these purposes.
• They are as follows:
1. Frequency Test: Uses the Kolmogorov Smirnov (KS) or Chi-square
test to compare the distribution of the set of numbers generated to
a uniform distribution.

27
Testing for Randomness
2. Runs Test: Tests the runs up and down, or the runs above and below the
mean, by comparing the actual values to the expected values.

3. Autocorrelation Test: Tests the correlation between numbers and
compares the sample correlation to the expected correlation of zero.

4. Gap Test: Counts the number of digits that appear between repetitions
of a particular digit and then uses the KS test to compare with the
expected size of gaps.

28
Testing for Randomness
5. Poker Test: Treats numbers grouped together as a poker hand. The
hands obtained are then compared to what is expected using the
Chi-square test.

29
Frequency Test
• Kolmogorov Smirnov Test
• Chi-Square Test
• Both of these tests measure the degree of agreement between the
distribution of a sample of generated random numbers and the
theoretical distribution.
• Both tests are based on the null hypothesis of no-significant
difference between the sample distribution and theoretical
distribution.

30
Frequency Test
• 1. The Kolmogorov-Smirnov (KS) test: This test compares the
continuous cdf F(x) of the uniform distribution with the empirical cdf
$S_N(x)$ of the sample of N observations.
• By definition,
• $F(x) = x, \quad 0 \le x \le 1$.

• If the sample from the random number generator is $R_1, R_2, \ldots, R_N$,
then the empirical cdf $S_N(x)$ is defined by
• $S_N(x) = \dfrac{\text{number of } R_1, R_2, \ldots, R_N \text{ which are} \le x}{N}$

31
Frequency Test
• The Kolmogorov-Smirnov test is based on the largest absolute deviation
between F(x) and $S_N(x)$ over the range of the random variable, that is,
on the statistic
• $D = \max |F(x) - S_N(x)|$
• The sampling distribution of D is known; critical values are obtained from a table.
• The null hypothesis assumes no difference between the observed
and theoretical distributions.
• The steps are as follows:
32
Frequency Test
• Step 1: Rank the data from smallest to largest.
• Let $R_{(i)}$ denote the i-th smallest observation, so that
• $R_{(1)} \le R_{(2)} \le \cdots \le R_{(N)}$
• Step 2: Compute
• $D^+ = \max_{1 \le i \le N} \{ i/N - R_{(i)} \}$
• $D^- = \max_{1 \le i \le N} \{ R_{(i)} - (i-1)/N \}$
• Step 3: Compute $D = \max(D^+, D^-)$
• Step 4: Locate the critical value $D_\alpha$ from the table for the specified
significance level α and the given sample size N.

33
Frequency Test
• Step 5: If the sample statistic D is greater than the critical value $D_\alpha$,
the null hypothesis that the data are a sample from a uniform
distribution is rejected.
• If $D \le D_\alpha$, conclude that no difference has been detected between the
true distribution of {R1, R2, ..., RN} and the uniform distribution.

34
Example
• Suppose that the five numbers 0.44, 0.81, 0.14, 0.05, 0.93 are generated and
it is desired to perform a test for uniformity by using the Kolmogorov-
Smirnov test with the level of significance α = 0.05.
• Solution: The given random numbers are 0.44, 0.81, 0.14, 0.05, 0.93.
• Number of random numbers: N = 5
• Step 1: Rank the random numbers from smallest to largest:
• 0.05, 0.14, 0.44, 0.81, 0.93
• Step 2: Compute $D^+$ and $D^-$ as
• $D^+ = \max \{ i/N - R_{(i)} \}$ for i = 1, 2, ..., N
• $D^- = \max \{ R_{(i)} - (i-1)/N \}$ for i = 1, 2, ..., N
• using the following table.
35
i                 1      2      3      4      5
R(i)              0.05   0.14   0.44   0.81   0.93
i/N               0.20   0.40   0.60   0.80   1.00
i/N - R(i)        0.15   0.26   0.16   ---    0.07
R(i) - (i-1)/N    0.05   ---    0.04   0.21   0.13

(Entries shown as --- are negative and cannot be the maximum.)

Now, $D^+ = \max \{ i/N - R_{(i)} \} = \max(0.15, 0.26, 0.16, 0.07) = 0.26$

$D^- = \max \{ R_{(i)} - (i-1)/N \} = \max(0.05, 0.04, 0.21, 0.13) = 0.21$

36
• Step 3: Compute $D = \max(D^+, D^-)$
• = max(0.26, 0.21) = 0.26
• Step 4: The critical value of D from the table for α = 0.05 and N = 5 is
0.565.

• Step 5: Since the computed value D = 0.26 is less than the tabulated
critical value 0.565, the hypothesis that the distribution of the generated
random numbers is the uniform distribution is not rejected.
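The five steps of the Kolmogorov-Smirnov procedure can be sketched as follows. The function name `ks_uniform` is an assumption for illustration, and the critical value is simply the tabulated 0.565 quoted in the example rather than something the code derives.

```python
def ks_uniform(sample):
    """Return (D_plus, D_minus, D) for a KS test against the uniform(0, 1) cdf."""
    ranked = sorted(sample)                                        # Step 1
    n = len(ranked)
    d_plus = max((i + 1) / n - r for i, r in enumerate(ranked))    # max of i/N - R(i)
    d_minus = max(r - i / n for i, r in enumerate(ranked))         # max of R(i) - (i-1)/N
    return d_plus, d_minus, max(d_plus, d_minus)                   # Steps 2 and 3

d_plus, d_minus, d = ks_uniform([0.44, 0.81, 0.14, 0.05, 0.93])
critical = 0.565   # tabulated value for alpha = 0.05, N = 5 (taken from the example)

print(round(d_plus, 2), round(d_minus, 2), round(d, 2))   # 0.26 0.21 0.26
print("reject" if d > critical else "do not reject")      # do not reject
```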


Table A.8 Kolmogorov-Smirnov Critical Values


EXAMPLE 3.5
Q.N > Suppose that the five numbers 0.44, 0.81, 0.14, 0.05, 0.93 were
generated, and it is desired to perform a test for uniformity using
the Kolmogorov-Smirnov test with a level of significance α of 0.05.
Solution
First, the numbers must be ranked from smallest to largest,
i.e. 0.05, 0.14, 0.44, 0.81, 0.93.
Then the computations for $D^+$, namely $i/N - R_{(i)}$, and for
$D^-$, namely $R_{(i)} - (i-1)/N$, are carried out as tabulated below.
Simulation and Modeling / [Chapter 5] Nipun Thapa, https://collegenote.pythonanywhere.com

R(i)              0.05   0.14   0.44   0.81   0.93
i/N               0.20   0.40   0.60   0.80   1.00
i/N - R(i)        0.15   0.26   0.16   -      0.07
R(i) - (i-1)/N    0.05   -      0.04   0.21   0.13

The statistics are computed as D+ = 0.26 and D- = 0.21.

Therefore,
D = max{0.26, 0.21} = 0.26.
The critical value of D, obtained from Table A.8 for α = 0.05
and N = 5, is 0.565.

Since the computed value, 0.26, is less than the tabulated critical
value, 0.565, the hypothesis of no difference between the distribution
of the generated numbers and the uniform distribution is not rejected.

Example 3.6
• Suppose that the five numbers 0.24, 0.80, 0.11, 0.05, 0.93 were
generated, and it is desired to perform a test for uniformity using
the Kolmogorov-Smirnov test with a level of significance α of 0.01.

Example 3.7
• Suppose that the four numbers 0.80, 0.14, 0.05, 0.5 were
generated, and it is desired to perform a test for uniformity
using the Kolmogorov-Smirnov test with a level of
significance α of 0.10.

Example 3.8
• Suppose that the seven numbers 0.44, 0.81, 0.14, 0.05,
0.93, 0.01, 0.02 were generated, and it is desired to perform
a test for uniformity using the Kolmogorov-Smirnov test with
a level of significance α of 0.05.

The Chi-square Test


• The chi-square test uses the sample statistic

$\chi_0^2 = \sum_{i=1}^{n} \dfrac{(O_i - E_i)^2}{E_i}$

where
$O_i$ is the observed number in the i-th class,
$E_i$ is the expected number in the i-th class, and
n is the number of classes.
For the uniform distribution, $E_i$, the expected number in each class,
is given by
$E_i = N/n$
for equally spaced classes, where N is the total number of observations.
It can be shown that the sampling distribution of $\chi_0^2$ is approximately
the chi-square distribution with n - 1 degrees of freedom.

The Chi-square Test


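Algorithm
Step 1: Determine the order statistics
$R_{(1)} \le R_{(2)} \le \cdots \le R_{(N)}$
Step 2: Divide the range $R_{(N)} - R_{(1)}$ into n equidistant intervals $[a_i, b_i]$, such
that each interval has at least 5 observations.
Step 3: Calculate $\chi_0^2 = \sum_{i=1}^{n} (O_i - E_i)^2 / E_i$
Step 4: For the significance level α, determine the critical value $\chi^2_{\alpha, n-1}$;
the hypothesis of uniformity is rejected if $\chi_0^2$ exceeds it.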

Example 3.8
Q.N > Use the chi-square test with α = 0.05 to test whether
the data shown below are uniformly distributed.
0.34 0.83 0.96 0.47 0.79 0.99 0.37 0.72 0.06 0.18 0.90
0.76 0.99 0.30 0.71 0.17 0.51 0.43 0.39 0.26 0.25 0.79
0.77 0.17 0.23 0.99 0.54 0.56 0.84 0.97 0.89 0.64 0.67
0.82 0.19 0.46 0.01 0.97 0.24 0.88 0.87 0.70 0.56 0.56
0.82 0.05 0.81 0.30 0.40 0.64 0.44 0.81 0.41 0.05 0.93
0.66 0.28 0.94 0.64 0.47 0.12 0.94 0.52 0.45 0.65 0.10
0.69 0.96 0.40 0.60 0.21 0.74 0.73 0.31 0.37 0.42 0.34
0.58 0.19 0.11 0.46 0.22 0.99 0.78 0.39 0.18 0.75 0.73 0.79
0.29 0.67 0.74 0.02 0.05 0.42 0.49 0.49 0.05 0.62 0.78

Solution

The table above contains the essential computations for the chi-square test. The
test uses n = 10 intervals of equal length, namely [0.0, 0.1), [0.1, 0.2), ..., [0.9,
1.0). The value of $\chi_0^2$ is 3.4.
Here the degrees of freedom are n - 1 = 10 - 1 = 9 and α = 0.05. The
tabulated value is $\chi^2_{0.05, 9} = 16.9$. Since $\chi_0^2$ is much smaller than the
tabulated value of chi-square, the null hypothesis of a uniform distribution
is not rejected.
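The statistic itself is straightforward to compute once the sample is sorted into equal-width classes. The sketch below assumes values in [0, 1] and uses a hypothetical four-value toy sample purely to check the arithmetic; it is far too small for a real test, which needs at least 5 observations per class.

```python
def chi_square_uniform(sample, n_classes):
    """Chi-square statistic of `sample` against uniform(0, 1) with equal-width classes."""
    observed = [0] * n_classes
    for r in sample:
        # Class index of r; a value of exactly 1.0 is folded into the last class.
        observed[min(int(r * n_classes), n_classes - 1)] += 1
    expected = len(sample) / n_classes            # E_i = N / n
    return sum((o - expected) ** 2 / expected for o in observed)

# Hypothetical toy sample (not the 100 values of the example above).
toy = [0.1, 0.2, 0.3, 0.6]
print(chi_square_uniform(toy, 2))   # observed [3, 1], expected 2 each -> 1.0
```

For the example above one would call the function with the 100 listed values and `n_classes=10`, then compare the result with the tabulated 16.9.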


Both the Kolmogorov-Smirnov and the chi-square test are acceptable for
testing the uniformity of a sample of data, provided that the sample size
is large. However, the Kolmogorov-Smirnov test is the more powerful of the
two and is recommended. Furthermore, the Kolmogorov-Smirnov test
can be applied to small sample sizes, whereas the chi-square test is valid
only for large samples, say N >= 50.

Tests for independence include the three types of tests given below:
1) Autocorrelation test: Tests the correlation between numbers and
compares the sample correlation to the expected correlation of zero.
2) Gap test: Counts the number of digits that appear between repetitions
of a particular digit and then uses the Kolmogorov-Smirnov test to compare
with the expected size of gaps.
3) Poker test: Treats numbers grouped together as a poker hand. The hands
obtained are then compared to what is expected using the chi-square test.

Tests for Autocorrelation


The tests for autocorrelation are concerned with the
dependence between numbers in a sequence. As an example, consider
the following sequence of numbers:
0.12 0.01 0.23 0.28 0.89 0.31 0.64 0.28 0.83 0.93
0.99 0.15 0.33 0.35 0.91 0.41 0.60 0.27 0.75 0.88
0.68 0.49 0.05 0.43 0.95 0.58 0.19 0.36 0.69 0.87
From a visual inspection, these numbers appear random, and they would
probably pass all the tests presented to this point. However, an examination of the
5th, 10th, 15th (every five numbers beginning with the fifth), and so on indicates a
very large number in that position.
Now, 30 numbers is a rather small sample size to reject a random-number
generator, but the notion is that numbers in the sequence might be related. In this
particular section, a method for determining whether such a relationship exists is
described. The relationship would not have to be all high numbers. It is possible to
59

Tests for Autocorrelation


The autocorrelation test is a statistical test that determines whether
a random number generator is producing independent random numbers
in a sequence. The test is concerned with the dependence between
numbers in a sequence. It computes the autocorrelation between every
m-th number (m is also known as the lag), starting with the i-th number.
The variables involved in this test are:
• m is the lag, the spacing between the numbers being tested.
• i is the index of the number from which we start.
• N is the number of random numbers generated.

60

Tests for Autocorrelation


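• Now the autocorrelation between
$R_i, R_{i+m}, R_{i+2m}, \ldots, R_{i+(M+1)m}$ is computed as

$\hat{\rho}_{im} = \dfrac{1}{M+1} \left[ \sum_{k=0}^{M} R_{i+km} \, R_{i+(k+1)m} \right] - 0.25$

where M is the largest integer such that $i + (M+1)m \le N$.

Now the test statistic is

$Z_0 = \dfrac{\hat{\rho}_{im}}{\sigma_{\hat{\rho}_{im}}}$, where $\sigma_{\hat{\rho}_{im}} = \dfrac{\sqrt{13M + 7}}{12(M+1)}$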


61

Tests for Autocorrelation


Q.N.> Test whether the 3rd, 8th, 13th, and so on, numbers in the
sequence at the beginning of this section are autocorrelated.
(Use α = 0.05.) Here, i = 3 (beginning with the third number), m =
5 (every five numbers), N = 30 (30 numbers in the sequence).
Solution:
First we calculate the value of M using the condition
i + (M + 1)m <= N.
Since i = 3, m = 5, and N = 30, we have
3 + (M + 1)5 <= 30,
i.e. 3 + 5M + 5 <= 30,
i.e. 5M <= 22,
i.e. M <= 22/5 = 4.4.
Hence M = 4, the largest integer satisfying the condition.

Then, $\hat{\rho}_{35} = \dfrac{1}{4+1} [ (0.23)(0.28) + (0.28)(0.33) + (0.33)(0.27) + (0.27)(0.05) + (0.05)(0.36) ] - 0.25 = -0.1945$

and

$\sigma_{\hat{\rho}_{35}} = \dfrac{\sqrt{13(4) + 7}}{12(4+1)} = 0.1280$

Then, the test statistic assumes the value
$Z_0 = -0.1945 / 0.1280 = -1.519$
Now, the critical value is
$Z_{0.025} = 1.96$ ($Z_{\alpha/2}$ is taken in this test).
Since $-1.96 \le Z_0 \le 1.96$, the hypothesis of independence cannot be
rejected on the basis of this test.
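The whole calculation for this example can be reproduced with a short function. This is a sketch under the formulas used in the worked solution; the name `autocorrelation_test` and the 1-based meaning of `i` are choices made here for illustration.

```python
import math

def autocorrelation_test(seq, i, m):
    """Return (M, rho_hat, sigma, z0) for lag m starting at the i-th number (1-based)."""
    n = len(seq)
    big_m = (n - i) // m - 1        # largest M with i + (M + 1) * m <= N
    products = sum(seq[i - 1 + k * m] * seq[i - 1 + (k + 1) * m]
                   for k in range(big_m + 1))
    rho_hat = products / (big_m + 1) - 0.25
    sigma = math.sqrt(13 * big_m + 7) / (12 * (big_m + 1))
    return big_m, rho_hat, sigma, rho_hat / sigma

seq = [0.12, 0.01, 0.23, 0.28, 0.89, 0.31, 0.64, 0.28, 0.83, 0.93,
       0.99, 0.15, 0.33, 0.35, 0.91, 0.41, 0.60, 0.27, 0.75, 0.88,
       0.68, 0.49, 0.05, 0.43, 0.95, 0.58, 0.19, 0.36, 0.69, 0.87]

big_m, rho_hat, sigma, z0 = autocorrelation_test(seq, i=3, m=5)
print(big_m, round(rho_hat, 4), round(sigma, 4), round(z0, 2))  # 4 -0.1945 0.128 -1.52
print("reject" if abs(z0) > 1.96 else "do not reject")          # do not reject
```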

0.12 0.01 0.23 0.28 0.89 0.31 0.64 0.28 0.83


0.93 0.99 0.15 0.33 0.35 0.91 0.41 0.60 0.27
0.75 0.88 0.68 0.49 0.05 0.43 0.95 0.58 0.19
0.36 0.69 0.87
Q.N.> Test whether the 2nd, 8th, 14th, and so on, numbers
in the sequence at the beginning of this section are
autocorrelated. (Use α = 0.05.) Here, i = 2 (beginning with the
second number), m = 6 (every six numbers), N = 30 (30
numbers in the sequence).
65

0.12 0.01 0.23 0.28 0.89 0.31 0.64 0.28


0.83 0.93 0.99 0.15 0.33 0.35 0.91 0.41
0.60 0.27 0.75 0.88 0.68 0.49 0.05 0.43
0.95 0.58 0.19 0.36 0.69 0.87
Q.N.> Test whether the 6th, 10th, 14th, and so on, numbers
in the sequence at the beginning of this section are
autocorrelated. (Use α = 0.05.) Here, i = 6 (beginning with the
sixth number), m = 4 (every four numbers), N = 30
(30 numbers in the sequence).
66

0.12 0.01 0.23 0.28 0.89 0.31 0.64 0.28


0.83 0.93 0.99 0.15 0.33 0.35 0.91 0.41
0.60 0.27 0.75 0.88 0.68 0.49 0.05 0.43
0.95 0.58 0.19 0.36 0.69 0.87
Q.N.> Test whether the 5th, 10th, 15th, and so on, numbers
in the sequence at the beginning of this section are
autocorrelated. (Use α = 0.05.) Here, i = 5 (beginning with the
fifth number), m = 5 (every five numbers), N = 30
(30 numbers in the sequence).
67

0.12 0.01 0.23 0.28 0.89 0.31 0.64 0.28


0.83 0.93 0.99 0.15 0.33 0.35 0.91 0.41
0.60 0.27 0.75 0.88 0.68 0.49 0.05 0.43
0.95 0.58 0.19 0.36 0.69 0.87
Q.N.> Test whether the 2nd, 12th, 22nd, and so on, numbers
in the sequence at the beginning of this section are
autocorrelated. (Use α = 0.05.) Here, i = 2 (beginning with the
second number), m = 10 (every ten numbers), N = 30
(30 numbers in the sequence).
68

Gap test
The gap test is used to determine the significance of the interval
between recurrences of the same digit. A gap of length x occurs
between the recurrences of some specified digit.
The following example illustrates the length of gaps associated with
the digit 3:

4, 1, 3, 5, 1, 7, 2, 8, 2, 0, 7, 9, 1, 3, 5, 2, 7, 9, 4, 1, 6, 3,
3, 9, 6, 3, 4, 8, 2, 3, 1, 9, 4, 4, 6, 8, 4, 1, 3.

There are seven 3's in the sequence, so only six gaps can occur. The
first gap is of length 10, the second gap is of length 7, the third gap
is of length zero, and so on. Similarly, the gaps associated with other
digits can be calculated. The theoretical probability of first gap (of length 10
69

Gap test
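The probability of a particular gap length can be determined by
Bernoulli trials.

If we are only concerned with digits between 0 and 9, then

$P(\text{gap of length } x) = (0.9)^x (0.1), \quad x = 0, 1, 2, \ldots$

The theoretical frequency distribution for randomly ordered digits is
given by

$F(x) = P(\text{gap} \le x) = 0.1 \sum_{n=0}^{x} (0.9)^n = 1 - 0.9^{x+1}$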

Gap test
1. Specify the cdf (cumulative distribution function) for the
theoretical frequency distribution, given by
$F(x) = 1 - 0.9^{x+1}$,
based on the selected class interval width.
2. Arrange the observed sample of gaps in a cumulative distribution
with the same classes.
3. Find D, the maximum deviation between F(x) and $S_N(x)$, as
$D = \max |F(x) - S_N(x)|$
where $S_N(x) = \dfrac{\text{number of gaps} \le x}{\text{total number of gaps}}$
4. Determine the critical value $D_\alpha$ from the KS table for the
specified value of α and the sample size N.
5. If the calculated value $D_{cal}$ is less than $D_\alpha$, the null
hypothesis is not rejected.
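Two building blocks of the gap test, extracting the gap lengths for one digit and evaluating the theoretical cdf, can be sketched as below. The function names are assumptions made for illustration; the digit sequence is the 39-digit example used earlier for the digit 3.

```python
def gap_lengths(digits, target):
    """Lengths of the gaps between successive occurrences of `target`."""
    positions = [k for k, d in enumerate(digits) if d == target]
    return [later - earlier - 1 for earlier, later in zip(positions, positions[1:])]

def gap_cdf(x):
    """Theoretical P(gap <= x) for independent decimal digits: 1 - 0.9**(x + 1)."""
    return 1 - 0.9 ** (x + 1)

digits = [4, 1, 3, 5, 1, 7, 2, 8, 2, 0, 7, 9, 1, 3, 5, 2, 7, 9, 4, 1,
          6, 3, 3, 9, 6, 3, 4, 8, 2, 3, 1, 9, 4, 4, 6, 8, 4, 1, 3]

print(gap_lengths(digits, 3))   # [10, 7, 0, 2, 3, 8]
print(round(gap_cdf(3), 4))     # 0.3439
```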
71

Example 3.11
Q.N-> Based on the frequency with which gaps occur, analyze the 110
digits below to test whether they are independent. Use α = 0.05.
4, 1, 3, 5, 1, 7, 2, 8, 2, 0, 7, 9, 1, 3, 5, 2, 7, 9, 4, 1, 6, 3, 3, 9,
6, 3, 4, 8, 2, 3, 1, 9, 4, 4, 6, 8, 4, 1, 3, 8, 9, 5, 5, 7, 3, 9, 5, 9,
8, 5, 3, 2, 2, 3, 7, 4, 7, 0, 3, 6, 3, 5, 9, 9, 5, 5, 5, 0, 4, 6, 8, 0,
4, 7, 0, 3, 3, 0, 9, 5, 7, 9, 5, 1, 6, 6, 3, 8, 8, 8, 9, 2, 9, 1, 8, 5,
4, 4, 5, 0, 2, 3, 9, 7, 1, 2, 0, 3, 6, 3
72

Solution
The number of gaps is given by the number of data values
minus the number of distinct digits, or 110 - 10 = 100 in the
example. The numbers of gaps associated with the various
digits are as follows:

Digit        0   1   2   3    4    5    6   7   8   9
# of Gaps    7   8   8   17   10   13   7   8   9   13

Gap Length   Frequency   Relative    Cum. Relative   Theoretical   |F(x) - SN(x)|
                         Frequency   Frequency       Frequency
                                     SN(x)           F(x)
0-3          35          0.35        0.35            0.3439        0.0061
4-7          22          0.22        0.57            0.5695        0.0005
8-11         17          0.17        0.74            0.7176        0.0224
12-15        9           0.09        0.83            0.8147        0.0153
16-19        5           0.05        0.88            0.8784        0.0016
20-23        6           0.06        0.94            0.9202        0.0198
24-27        3           0.03        0.97            0.9477        0.0223
28-31        0           0.00        0.97            0.9657        0.0043
32-35        0           0.00        0.97            0.9775        0.0075
36-39        2           0.02        0.99            0.9852        0.0048
40-43        0           0.00        0.99            0.9903        0.0003
44-47        1           0.01        1.00            0.9936        0.0064

The critical value of D is given by $D_{0.05} = 1.36 / \sqrt{100} = 0.136$.

Since $D = \max |F(x) - S_N(x)| = 0.0224$ is less than $D_{0.05}$, we do not
reject the hypothesis of independence on the basis of this test.

In general, if $D_{cal} < D_\alpha$, the null hypothesis is not rejected.
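The deviation column and the large-sample critical value can be recomputed from the observed class frequencies alone. The sketch below takes the class upper limits and frequencies from the table of this example; the variable names are illustrative.

```python
import math

upper_limits = [3, 7, 11, 15, 19, 23, 27, 31, 35, 39, 43, 47]   # upper end of each class
frequencies = [35, 22, 17, 9, 5, 6, 3, 0, 0, 2, 0, 1]           # observed gaps per class
total = sum(frequencies)

d, cumulative = 0.0, 0
for x, f in zip(upper_limits, frequencies):
    cumulative += f
    s_n = cumulative / total          # observed cumulative relative frequency SN(x)
    f_x = 1 - 0.9 ** (x + 1)          # theoretical cdf F(x)
    d = max(d, abs(f_x - s_n))

critical = 1.36 / math.sqrt(total)    # approximate D_0.05 for a large sample
print(round(d, 4), round(critical, 3))   # 0.0224 0.136
```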


75

Example 3.12 : Gap test


Q.N.> Based on the frequency with which gaps occur, analyze the digits below to
test whether they are independent. Use α = 0.05.

Gap Length Frequency


0-4 25
5-9 15
10-14 10
15-19 3
20-24 2
25-29 0
30-34 5
35-39 10
40-44 30
76

Example 3.13 : Gap test


Q.N.> Based on the frequency with which gaps occur, analyze the digits below to
test whether they are independent. Use α = 0.05.

Gap Length Frequency


0-2 25
3-5 20
6-8 15
9-11 3
12-14 12
15-17 0
18-20 5
21-23 15
24-26 30
77

Example 3.14 : Gap test


Q.N.> Based on the frequency with which gaps occur, analyze the digits below to
test whether they are independent. Use α = 0.2.

Gap Length Frequency


0-5 15
6-10 20
11-15 5
16-20 3
21-25 12
26-30 15
78

Example 3.15 : Gap test


Q.N.> Based on the frequency with which gaps occur, analyze the digits below to
test whether they are independent. Use α = 0.05.

Gap Length Frequency


0-10 5
11-20 2
21-30 15
31-40 3
41-50 10
51-60 0
61-70 15
71-80 15
81-90 10
79

Example 3.16 : Gap test


Q.N.> Based on the frequency with which gaps occur, analyze the digits below to
test whether they are independent. Use α = 0.05.

Gap Length Frequency


0-9 15
10-19 20
20-29 15
30-39 3
40-49 10
80

Example 3.17 : Gap test


Q.N.> Based on the frequency with which gaps occur, analyze the digits below to
test whether they are independent. Use α = 0.05.

Gap Length Frequency


0-1 25
1-3 15
4-6 10
7-9 3
10-12 2
13-15 0
81

Poker Test
• The Poker Test is a test for independence based on the
frequency with which certain digits are repeated within
a series of numbers.
• This test checks not only the randomness of the
sequence of numbers, but also the digits comprising
each of the numbers.

Poker Test
• The expected value of each combination of digits
in a number is compared with the observed value
by means of the chi-square test.
• The hypothesis of independence is accepted if the observed
chi-square sum over all the possible combinations of
digits is less than the critical value for the given
degrees of freedom at the specified significance level.

Poker Test
• This test gets its name from the card game called poker.
• This test checks not only the randomness of the sequence
of numbers, but also the digits comprising each number.
• Every random number of five digits, or every sequence
of five digits, is treated as a poker hand.

Poker Test
• 71549 has five different digits
• 55137 would be a pair
• 33669 would be two pairs
• 55513 would be three of a kind
• 44477 would be a full house
• 77774 would be four of a kind
• 88888 would be five of a kind
• The occurrence of five of a kind is rare.

Poker Test
• In 10,000 random and independent numbers of five
digits each, you may expect the following distribution
of the various combinations:

Five different digits    3024 or 30.24 %
Pairs                    5040 or 50.40 %
Two pairs                1080 or 10.80 %
Three of a kind           720 or  7.20 %
Full houses                90 or  0.90 %
Four of a kind             45 or  0.45 %
Five of a kind              1 or  0.01 %
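These percentages can be confirmed by enumerating all 100,000 possible five-digit hands and classifying each one by how often its digits repeat. The sketch below is illustrative; the dictionary `HAND_NAMES` and the function `classify` are names chosen here, not part of the slides.

```python
from collections import Counter
from itertools import product

# Sorted digit multiplicities -> name of the hand.
HAND_NAMES = {
    (1, 1, 1, 1, 1): "five different digits",
    (2, 1, 1, 1): "one pair",
    (2, 2, 1): "two pairs",
    (3, 1, 1): "three of a kind",
    (3, 2): "full house",
    (4, 1): "four of a kind",
    (5,): "five of a kind",
}

def classify(hand):
    """Name the poker hand formed by a tuple of five digits."""
    pattern = tuple(sorted(Counter(hand).values(), reverse=True))
    return HAND_NAMES[pattern]

tally = Counter(classify(hand) for hand in product(range(10), repeat=5))
for name in HAND_NAMES.values():
    # 100,000 equally likely hands, so count / 1000 is the percentage.
    print(f"{name:22s} {tally[name]:6d} {tally[name] / 1000:6.2f}%")
```

Dividing each count by 10 gives the expected frequencies per 10,000 numbers listed above.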
86

Poker Test
• Poker test: based on the frequency with which
certain digits are repeated.
Example:
0.255 0.577 0.331 0.414 0.828 0.909 0.303 0.001 ...
Note: a pair of like digits appears in each number
generated.

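Poker Test
• The test is based on the frequency with which certain digits are
repeated in a series of numbers.
• Example: 0.255, 0.577, 0.414, 0.828, 0.909, 0.303, 0.001
• In each of these numbers a pair of like digits appears.
• For three-digit numbers there are three possibilities:
• All three digits different
• All three digits equal
• Exactly one pair of like digits
• $P(\text{exactly one pair}) = \binom{3}{2} (0.1)(0.9) = 0.27$
• Here $\binom{3}{2}$ is the number of positions the pair can occupy, 0.1 is the
probability that, given a fixed digit, a second digit is the same, and 0.9 is
the probability that the remaining digit is different.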

Poker Test
• P(three different digits)
= P(second different from first) × P(third different from first and second)
= (0.9)(0.8) = 0.72

• P(three like digits)
= P(second digit same as first) × P(third digit same as first and second)
= (0.1)(0.1) = 0.01

Poker test:

• Measure the observed frequency for each of the three cases.
• Compute the expected frequency Ei (probability × number of values,
e.g. × 1000 for a sample of 1000 numbers).
• Perform the chi-square test.

Poker Test
In 3-digit numbers, there are only 3 possibilities.
P(3 different digits) =
= P(2nd diff. from 1st) * P(3rd diff. from 1st & 2nd)
= (0.9) (0.8) = 0.72

P(3 like digits) =


= P (2nd digit same as 1st) * P(3rd digit same as 1st)
= (0.1) (0.1) = 0.01

P(exactly one pair) = 1 - 0.72 - 0.01 = 0.27


90

Example 3.18
Q.N.> A sequence of 1000 three-digit numbers has been
generated and an analysis indicates that 680 have three
different digits, 289 contain exactly one pair of like digits,
and 31 contain three like digits. Based on the poker test,
are these numbers independent ? Let α = 0.05. Test these
numbers using poker test for three digits.
91

Solution
Combination, i            Observed         Expected              (Oi - Ei)   (Oi - Ei)^2 / Ei
                          Frequency (Oi)   Frequency (Ei)
Three different digits    680              0.72 × 1000 = 720     -40         2.22
Three like digits         31               0.01 × 1000 = 10      21          44.10
Exactly one pair          289              0.27 × 1000 = 270     19          1.33
Total                     1000             1000                              47.65

The appropriate degrees of freedom are one less than the number of class
intervals: here n - 1 = 2, since there are only n = 3 classes. Since
47.65 > $\chi^2_{0.05, 2} = 5.99$ (tabulated value), the independence of the
numbers is rejected on the basis of this test.
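The chi-square sum of this example can be reproduced as follows. The dictionary keys are labels chosen here for readability; the observed counts, class probabilities and the tabulated critical value 5.99 are taken from the example. Without intermediate rounding the sum comes out near 47.66, slightly above the 47.65 obtained from the rounded table entries.

```python
observed = {"three different": 680, "three alike": 31, "exactly one pair": 289}
probability = {"three different": 0.72, "three alike": 0.01, "exactly one pair": 0.27}

n = sum(observed.values())                      # 1000 three-digit numbers
chi_square = sum((observed[k] - n * probability[k]) ** 2 / (n * probability[k])
                 for k in observed)
critical = 5.99   # tabulated chi-square value for alpha = 0.05, 2 degrees of freedom

print(round(chi_square, 2))                                     # 47.66
print("reject" if chi_square > critical else "do not reject")   # reject
```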
92

Example 3.19
Q.N.> A sequence of three-digit numbers has been
generated and an analysis indicates that 380 have three
different digits, 389 contain exactly one pair of like
digits, and 231 contain three like digits. Based on the poker
test,
are these numbers independent ? Let α = 0.05. Test these
numbers using poker test for three digits.
93

Example 3.20
Q.N.> A sequence of three-digit numbers has been
generated and an analysis indicates that 320 have three
different digits, 420 contain exactly one pair of like
digits, and 160 contain three like digits. Based on the poker
test,
are these numbers independent ? Let α = 0.05. Test these
numbers using poker test for three digits.
94

Example 3.21
Q.N.> A sequence of three-digit numbers has been
generated and an analysis indicates that 300 have three
different digits, 500 contain exactly one pair of like
digits, and 200 contain three like digits. Based on the poker
test,
are these numbers independent ? Let α = 0.05. Test these
numbers using poker test for three digits.
95

In a four-digit number, there are five different possibilities:
P(four different digits) = 4C4 × 10/10 × 9/10 × 8/10 × 7/10 = 0.504
P(one pair) = 4C2 × 10/10 × 1/10 × 9/10 × 8/10 = 0.432
P(two pairs) = (4C2 / 2) × 10/10 × 1/10 × 9/10 × 1/10 = 0.027
P(three digits of a kind) = 4C3 × 10/10 × 1/10 × 1/10 × 9/10 = 0.036
P(four digits of a kind) = 4C4 × 10/10 × 1/10 × 1/10 × 1/10 = 0.001

Example 3.22 (TU 2067/ 10 marks)


Q.N.> Explain the independence test. A sequence of 1000
four digit numbers has been generated and an
analysis indicates the following combinations and
frequencies.
97

solution
98

Example 3.23 (TU 2072/ 10 marks)


Combination i Observed Frequency 0 i


Four different digit 565
One pair 392
Two pair 17
Three like digits 24
Four like digits 2
1000
99

Example 3.24 (TU 2073/ 10 marks)
