C++ Examples - Cumulative Normal Distribution Function
C++ Examples - Cumulative Normal Distribution Function
html
C++ Examples - Cumulative Normal Distribution Function
Question:
Write a function to calculate N(x) the cumulative normal distribution with mean zero and variance
one given by the formula
Again, first we think about the stages of our program. We must choose a method to implement the
integration, test and verify it. Once it is then applied to the problem we need to decide how to deal with
the concept of infinity in this setting. First let use code up the integration method. To be able to check the
method, you should choose a simple function with a known integral so that we can compare accuracy.
Start with a new project and an empty cpp file.
#include <iostream>
#include <cmath>
using namespace std;
int main()
{
// do nothing
return 0;
}
download
Compile and check it runs. We are going to implement Simpsons method for integration firstly on the sin
function so that we can check the results. The method is given by
(1)
Start by declaring the input parameters required by the method, including the number of steps and the
integration range, and follow with the variables used inside such as the dummy integration variable s and
the step size h. We should supply default values and initialise variables if required
// number of steps
int N=10;
// range of integration
double a=0,b=1.;
// local variables
double s,h,sum=0.;
// inialise the variables
h=(b-a)/N;
download
Next we need to add the first few terms and the last one to sum, before adding a loop to go over the rest.
// add in the first few terms
Note that we should really check that N is even as this method relies on that fact. Since we have already
taken care of the terms for 0, 1 and N, the loop must run over the terms 2 up to and including N-1. Check
your loop runs over the correct values before continuing
// loop over terms 2 up to N-1
for(int i=1;i<N/2;i++)
{
s = a + 2*i*h;
// check even terms
cout << 2*i << " " << s << endl;
s = s + h;
// check odd terms
cout << 2*i+1 << " " << s << endl;
}
download
Next we are going to change this algorithm to solve the normal distribution. Do this in 2 steps, first go
through and change all calls to the sin function by exp(-s*s/2.), then multiply the final result by 1/sqrt(2*pi).
The result (for a=0 and b=1) should be 0.341345. Now we have to deal with infinity, so how can we
integrate over infinity? For the normal distribution this is quite easy as it is an even function and we know
that the integral between 0 and infinity is one half. So simply adding one half to the result of the
integration between 0 and b gives the value of N(b) for all b. At this stage your code should look like
// add in the first few terms
sum = sum + exp(-a*a/2.) + 4.*exp(-(a+h)*(a+h)/2.);
// and the last one
sum = sum + exp(-b*b/2.);
// loop over terms 2 up to N-1
for(int i=1;i<N/2;i++)
{
s = a + 2*i*h;
sum = sum + 2.*exp(-s*s/2.);
s = s + h;
sum = sum + 4.*exp(-s*s/2.);
}
// complete the integral
sum = 0.5 + h*sum/3./sqrt(8.*atan(1.));
// output results
cout << sum << endl;
download
The last thing to take account of is what happens when x is either very large and positive or very large
and negative. The best thing to do here is use analytical results and return 0 if x is large and negative and
1 if x if large and positive. Dont forget to set the value of b to be x and choose a value of N sufficiently
large to maintain accuracy.
#include <iostream>
#include <cmath>
using namespace std;
double normalDistribution(double x)
{
if(x<-10.)return 0.;
if(x>10.)return 1.;
// number of steps
int N=2000;
// range of integration
double a=0,b=x;
// local variables
double s,h,sum=0.;
// inialise the variables
h=(b-a)/N;
// add in the first few terms
sum = sum + exp(-a*a/2.) + 4.*exp(-(a+h)*(a+h)/2.);
// and the last one
sum = sum + exp(-b*b/2.);
// loop over terms 2 up to N-1
for(int i=1;i<N/2;i++)
{
s = a + 2*i*h;
sum = sum + 2.*exp(-s*s/2.);
s = s + h;
sum = sum + 4.*exp(-s*s/2.);
}
// complete the integral
sum = 0.5 + h*sum/3./sqrt(8.*atan(1.));
// return result
return sum;
int main()
{
cout.precision(16);
cout << "N(0)=" << normalDistribution(0.) << endl;
cout << "N(1)=" << normalDistribution(1.) << endl;
cout << "N(2)=" << normalDistribution(2.) << endl;
return 0;
}
download
Here we needed to choose N=2000 to get close to double precision, meaning that a single function call is
incredibly expensive. There are other, more efficient approximations out there. If you are using a compiler
that is math library C99 compliant you will have access to the erfc function, which will give you double
precision accuracy using the formula
Unfortunately if you are using Visual Studio it is not yet implemented (might yet be in VS2013) so we
need another method. One such method is described in West (2005) and they give a VB code
implementation of the method. We are going to follow that method closely. Start with your empty program,
I have included some benchmark values in comments to check against
#include <iostream>
#include <cmath>
using namespace std;
int main()
{
// Benchmark values of N(x)
// N(0) = 0.5
// N(1) = 0.841344746068543
// N(2) = 0.977249868051821
return 0;
}
download
where P, Q and F are polynomials, more details can be found in the examples sheet (click here). First
declare and assign some of the variables needed for the calculation
// calculate \sqrt{2\pi} upfront once
double RT2PI = sqrt(4.0*acos(0.0));
// calculate 10/\sqrt{2} upfront once
double SPLIT = 10./sqrt(2);
// enter coefficients for the polynomials P and Q
double a[] =
{220.206867912376,221.213596169931,112.079291497871,33.912866078383,6.37396220
353165,0.700383064443688,3.52624965998911e-02};
double b[] =
{440.413735824752,793.826512519948,637.333633378831,296.564248779674,86.780732
2029461,16.064177579207,1.75566716318264,8.83883476483184e-02};
// calculate the value of N at x=1
double x=1.;
download
where we have chosen x = 1 to test. Since we know what x is we dont need to consider the 3 cases. Now
calculate the polynomial expressions using a nested multiplication (important for accuracy) and we get
double NDash = exp(-x*x/2.0)/RT2PI;
double Px = (((((a[6]*x + a[5])*x + a[4])*x + a[3])*x + a[2])*x + a[1])*x +
a[0];
double Qx = ((((((b[7]*x + b[6])*x + b[5])*x + b[4])*x + b[3])*x + b[2])*x +
b[1])*x + b[0];
cout << Px << " " << Qx << " " << NDash << endl;
// output is 594.522 2272.83 0.241971
download
the output you should expect is shown. We can now estimate the value of N(x)
cout.precision(16);
cout << 1 - exp(-x*x/2.0)*Px/Qx << endl;
// output is 0.841344746068543
download
You now need to include the different cases, such as negative x, etc, and then move everything into a
function. After some testing you should arrive at an algorithm that looks something like:
#include <iostream>
#include <cmath>
using namespace std;
/*
*
A * * fast* and accurate polynomial approximation to $N(x)$ (double
precision):
*
$$ N(x)= 1-\sqrt{2\pi}N'(x)\frac{P(x)}{Q(x)}, \quad 0 \leq x\leq
\frac{10}{\sqrt{2}},$$
*
$$ N(x)= 1-N'(x)\frac{1}{F_4(x)}, \quad \frac{10}{\sqrt{2}} < x\leq
37,$$
*
$$ N(x)= 1, \quad x > 37,$$
*
$$N(x)=1-N(-x),\quad x<0.$$
*
Here
*
$$
*
P(x) = \sum_{i=0}^6 a_i x^i, \quad \quad Q(x) = \sum_{i=0}^7 b_i x^i
*
$$
*
and the coefficients are:
*
\begin{tabular}{lc||lc}
*
\hline
*
$a_0$ & 220.206867912376& $b_0$&440.413735824752 \\
*
$a_1$ & 221.213596169931& $b_1$&793.826512519948 \\
*
$a_2$ & 112.079291497871& $b_2$& 637.333633378831\\
*
$a_3$ & 33.912866078383& $b_3$&296.564248779674 \\
*
$a_4$ & 6.37396220353165& $b_4$&86.7807322029461 \\
*
$a_5$ & 0.700383064443688& $b_5$& 16.064177579207\\
*
$a_6$ & 0.0352624965998911& $b_6$&1.75566716318264 \\
*
& & $b_7$& 0.0883883476483184\\
*
\end{tabular}
*
For the function $F_4(x)$ we use the recurrence relation
*
$$
*
F_0(x) = x + \frac{13}{20}
*
$$
*
$$
*
F_{i}(x) = x + \frac{5-i}{F_{i-1}(x)} \quad \quad \text{for }i=1,2,3,4
*
$$
*
*
On entry x is a real value, and we return double precision estimate of
the cummulative normal distribution
*/
double normalDistribution(double x)
{
// calculate \sqrt{2\pi} upfront once
static const double RT2PI = sqrt(4.0*acos(0.0));
// calculate 10/\sqrt{2} upfront once
static const double SPLIT = 10./sqrt(2);
static const double a[] =
{220.206867912376,221.213596169931,112.079291497871,33.912866078383,6.37396220
353165,0.700383064443688,3.52624965998911e-02};
static const double b[] =
{440.413735824752,793.826512519948,637.333633378831,296.564248779674,86.780732
2029461,16.064177579207,1.75566716318264,8.83883476483184e-02};
const double z = fabs(x);
// Now N(x) = 1 - N(-x) = 1-\sqrt{2\pi}N'(x)\frac{P(x)}{Q(x)}
// so N(-x) = \sqrt{2\pi}N'(x)\frac{P(x)}{Q(x)}
// now let \sqrt{2\pi}N'(z)\frac{P(x)}{Q(z)} = Nz
// Therefore we have
//
Nxm = N(x) = \sqrt{2\pi}N'(z)\frac{P(x)}{Q(z)} = Nz if x<0
//
Nxp = N(x) = 1 - \sqrt{2\pi}N'(z)\frac{P(x)}{Q(z)} = 1-Nz if x>=0
double Nz = 0.0;
//
return x>=0.0 ? 1-Nz : Nz;
int main()
{
cout.precision(16);
cout << "N(0)=" << normalDistribution(0.) << endl;
cout << "N(1)=" << normalDistribution(1.) << endl;
cout << "N(2)=" << normalDistribution(2.) << endl;
return 0;
}
download
References
West, Graeme. Better approximations to cumulative normal functions. Wilmott Magazine, 9:7076,
2005. URLhttps://lyle.smu.edu/~aleskovs/emis/sqc2/accuratecumnorm.pdf.