Best Square Root Method - Algorithm - Function (Precision Vs Speed) - CodeProject
CPOL
Info
First Posted
1 Apr 2010
Views
119,162
Downloads
1,074
Bookmarked
88 times
Square Root Methods Fast Algorithm Speed Precision computational Quake3 Fast Square Root Function Fast
Gaming
Introduction
I enjoy game programming with DirectX, and I noticed that the most frequently called function in most of my
games is the standard sqrt from math.h. That made me search for faster alternatives to the standard
sqrt. After some searching, I found many functions that are much faster, but there is always a
compromise between speed and precision. The main purpose of this article is to help people choose the
square root method that best suits their program.
Background
In this article, I compare 14 different methods for computing the square root, using the standard sqrt function as a
reference. For each method, I show its precision and speed relative to sqrt. The reference run looks like this:
for (int j = 0; j < AVG; j++)
{
    dur.Start();
    for (int i = 1; i < M; i++)
        RefTotalPrecision += sqrt((float)i);
    dur.Stop();
    Temp += dur.GetDuration();
}
RefTotalPrecision /= AVG;
Temp /= AVG;
RefSpeed = (float)(Temp) / CLOCKS_PER_SEC;
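The dur object in this listing is a timer whose definition is not shown here. As a rough sketch of the same measurement loop, assuming std::chrono::steady_clock in place of that timer (BenchResult, time_sqrt and time_reference are illustrative names, not part of the article's source):

```cpp
#include <cassert>
#include <chrono>
#include <cmath>

// Result of one benchmark: averaged output sum and mean seconds per pass.
struct BenchResult {
    double total;    // sum of f(i) over all passes, divided by the pass count
    double seconds;  // mean wall-clock time of one pass
};

// Runs `passes` passes of f over i = 1 .. m-1 and times each pass.
template <typename F>
BenchResult time_sqrt(F f, int m, int passes)
{
    assert(m > 1 && passes > 0);
    using clock = std::chrono::steady_clock;
    double total = 0.0;
    std::chrono::duration<double> elapsed{0.0};
    for (int j = 0; j < passes; ++j) {
        const auto start = clock::now();
        for (int i = 1; i < m; ++i)
            total += f(static_cast<float>(i));
        elapsed += clock::now() - start;
    }
    return {total / passes, elapsed.count() / passes};
}

// Example use: measure the standard library's float sqrt as the reference.
inline BenchResult time_reference(int m, int passes)
{
    return time_sqrt([](float v) { return std::sqrt(v); }, m, passes);
}
```

Returning the accumulated sum keeps the compiler from discarding the inner loop, and a steady clock is not affected by system time adjustments. Whether this matches what dur measures is an assumption.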
And for the other methods I do the same calculations, but in the end, I reference them to the sqrt.
for (int j = 0; j < AVG; j++)
{
    dur.Start();
    for (int i = 1; i < M; i++)
        TotalPrecision += sqrt1((float)i);
    dur.Stop();
    Temp += dur.GetDuration();
}
TotalPrecision /= AVG;
Temp /= AVG;
Speed = (float)(Temp) / CLOCKS_PER_SEC;
cout << "Precision = "
     << (double)(1 - abs((TotalPrecision - RefTotalPrecision) / (RefTotalPrecision))) * 100 << endl;
NOTES:
1. I assume that an error above the reference and an error below it count equally, which is why I use
"abs".
2. The Speed is reported as a direct percentage of the reference, while the Precision is reported as 100%
minus the relative deviation from the reference.
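The precision figure described in these notes boils down to one expression. As a minimal sketch (precision_percent is a hypothetical helper name, not part of the article's source), it can be isolated like this:

```cpp
#include <cassert>
#include <cmath>

// Agreement, in percent, between the summed outputs of an approximation
// (total) and the summed outputs of the reference sqrt (ref_total).
// Overshoot and undershoot are treated alike because of the absolute value.
double precision_percent(double total, double ref_total)
{
    assert(ref_total != 0.0);
    return (1.0 - std::fabs((total - ref_total) / ref_total)) * 100.0;
}
```

With this definition, a total that is 1% above the reference and one that is 1% below it both score 99. Because whole sums are compared, per-sample errors of opposite sign can cancel each other out.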
You can modify the value of M as you like; I initially set it to 10000.
You can modify AVG as well; the higher it is, the more accurate the results.
#define M 10000
#define AVG 10
Points of Interest
Precision-wise, the standard sqrt method is the best, but the other functions can be much faster, in some cases up to 5 times
faster. I would personally choose Method 14, as it has both high precision and high speed, but I'll leave the choice
to you.
I took 5 samples and averaged them, and here is the output:
According to the analysis, the performance ranking of the above methods (Speed x Precision) is:
NOTE: The performance of these methods depends highly on your processor and may change from one
computer to another.
The METHODS
Sqrt1
Reference: http://ilab.usc.edu/wiki/index.php/Fast_Square_Root
Algorithm: Babylonian Method + some manipulations on IEEE 32 bit floating point representation
float sqrt1(const float x)
{
    union
    {
        int i;
        float x;
    } u;
    u.x = x;
    // Initial guess obtained by working on the integer bit pattern
    u.i = (1 << 29) + (u.i >> 1) - (1 << 22);
    // Two Babylonian steps (simplified from:)
    // u.x = 0.5f * (u.x + x / u.x);
    // u.x = 0.5f * (u.x + x / u.x);
    u.x = u.x + x / u.x;
    u.x = 0.25f * u.x + x / u.x;
    return u.x;
}
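Reading a union member other than the one last written is not guaranteed to work in standard C++, although many compilers accept it. A sketch of the same idea that copies the bits with std::memcpy instead, assuming float is a 32-bit IEEE 754 type (sqrt1_memcpy is an illustrative name, not from the reference):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Same steps as sqrt1, but the float/integer conversion goes through memcpy.
float sqrt1_memcpy(const float x)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "needs 32-bit float");
    assert(x > 0.0f);  // the bit trick is only meaningful for positive inputs

    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    // Halve the bit pattern and re-add half of the exponent bias: first guess.
    bits = (1u << 29) + (bits >> 1) - (1u << 22);
    float y;
    std::memcpy(&y, &bits, sizeof y);

    // Two Babylonian steps, y = 0.5f * (y + x / y), folded together:
    // after the first line y holds twice the first iterate, so the second
    // line uses 0.25f * y and x / y to produce the second iterate directly.
    y = y + x / y;
    y = 0.25f * y + x / y;
    return y;
}
```

For x = 4.0f the initial guess is exactly 2.0f and the two steps leave it unchanged; for x = 2.0f the guess is 1.5f and the result lands near 1.414216.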
Sqrt2
Reference: http://ilab.usc.edu/wiki/index.php/Fast_Square_Root
Algorithm: The Magic Number (Quake 3)
#define SQRT_MAGIC_F 0x5f3759df
float sqrt2(const float x)
{
    const float xhalf = 0.5f * x;
    union // get bits for floating value
    {
        float x;
        int i;
    } u;
    u.x = x;
    u.i = SQRT_MAGIC_F - (u.i >> 1); // gives initial guess y0
    return x * u.x * (1.5f - xhalf * u.x * u.x); // Newton step, repeating increases accuracy
}
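The return line of sqrt2 can be read as one Newton step on the reciprocal square root, followed by a multiplication by x. A short derivation, writing y for the current guess of 1/sqrt(x) (this is the usual reading of the trick, not something stated in the article):

```latex
\begin{align*}
f(y) &= \frac{1}{y^{2}} - x, \qquad f'(y) = -\frac{2}{y^{3}} \\
y_{n+1} &= y_n - \frac{f(y_n)}{f'(y_n)}
         = y_n + \frac{y_n}{2} - \frac{x\,y_n^{3}}{2}
         = y_n\left(\frac{3}{2} - \frac{x}{2}\,y_n^{2}\right) \\
\sqrt{x} &= x \cdot \frac{1}{\sqrt{x}} \approx x\,y_{n+1}
\end{align*}
```

In the code, xhalf plays the role of x/2 and u.x the role of y_n, so the returned value is x times y_{n+1}.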
Sqrt3
Reference: http://ilab.usc.edu/wiki/index.php/Fast_Square_Root
Algorithm: Log base 2 approximation and Newton's Method
float sqrt3(const float x)
{
    union
    {
        int i;
        float x;
    } u;
    u.x = x;
    u.i = (1 << 29) + (u.i >> 1) - (1 << 22);
    return u.x;
}
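Why these constants work can be sketched with the usual log2 reading of a float's bits. Assume a positive, normalized IEEE 754 single x = 2^e (1 + m) with 0 <= m < 1, and write I_x for its bit pattern read as an integer (the notation is mine, not the reference's):

```latex
\begin{align*}
I_x &= 2^{23}\,(e + 127 + m), \qquad
\log_2 x = e + \log_2(1 + m) \approx e + m \\
I_x &\approx 2^{23}\,(\log_2 x + 127) \\
I_{\sqrt{x}} &\approx 2^{23}\left(\tfrac{1}{2}\log_2 x + 127\right)
             = \tfrac{1}{2}\,I_x + 127 \cdot 2^{22} \\
2^{29} - 2^{22} &= 2^{22}\,(2^{7} - 1) = 127 \cdot 2^{22}
\end{align*}
```

So shifting the bit pattern right by one and adding 2^29 - 2^22 is the integer form of halving log2(x). The only approximation is log2(1 + m) ≈ m, which is why the result is exact for even powers of two such as 4 and drifts in between.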
Sqrt4
Reference: I got it a long time ago from a forum and have forgotten the source; please contact me if you know its reference.
Algorithm: Bakhshali Approximation
float sqrt4(const float m)
{
    int i = 0;
    while ((i * i) <= m)
        i++;
    i--; // i is now the largest integer with i * i <= m
    float d = m - i * i;
    float p = d / (2 * i);
    float a = i + p;
    return a - (p * p) / (2 * a);
}
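Note that sqrt4 divides by 2 * i, so an input below 1 leaves i at 0 and the division is undefined. A sketch of the same approximation with that precondition made explicit (bakhshali_sqrt is an illustrative name, and the assertion is my addition):

```cpp
#include <cassert>

// Bakhshali step around the largest integer i with i * i <= m.
// Requires m >= 1 so that i >= 1 and the division by 2 * i is defined.
float bakhshali_sqrt(const float m)
{
    assert(m >= 1.0f);
    int i = 1;
    while (static_cast<float>(i + 1) * (i + 1) <= m)
        ++i;
    const float d = m - static_cast<float>(i) * i;  // distance from i * i
    const float p = d / (2.0f * i);                 // first-order correction
    const float a = i + p;                          // corrected estimate
    return a - (p * p) / (2.0f * a);                // second-order correction
}
```

Worked by hand for m = 10: i = 3, d = 1, p = 1/6, a = 19/6, and the result is 19/6 - 1/228 = 721/228 ≈ 3.162281, against a true value of about 3.162278.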
Sqrt5
Reference: http://www.dreamincode.net/code/snippet244.htm
Algorithm: Babylonian Method
float sqrt5(const float m)
{
    float i = 0;
    float x1, x2;
    // Coarse starting guess: step up in increments of 0.1 until i * i exceeds m
    while ((i * i) <= m)
        i += 0.1f;
    x1 = i;
    // Ten Babylonian steps: x2 = (m / x1 + x1) / 2
    for (int j = 0; j < 10; j++)
    {
        x2 = m;
        x2 /= x1;
        x2 += x1;
        x2 /= 2;
        x1 = x2;
    }
    return x2;
}
Sqrt6
Reference: http://www.azillionmonkeys.com/qed/sqroot.html#calcmeth
Algorithm: Dependent on the IEEE representation and only works for 32 bits
double sqrt6(double y)
{
    double x, z, tempf;
    // Points at the upper 32 bits of tempf (assumes little-endian and a 32-bit unsigned long)
    unsigned long *tfptr = ((unsigned long *)&tempf) + 1;
    tempf = y;
    *tfptr = (0xbfcdd90a - *tfptr) >> 1; // initial estimate of 1/sqrt(y)
    x = tempf;
    z = y * 0.5;
    x = (1.5 * x) - (x * x) * (x * z); // The more you make replicates of this statement
                                       // the higher the accuracy, here only 2 replicates are used
    x = (1.5 * x) - (x * x) * (x * z);
    return x * y;
}
Sqrt7
Reference: http://bits.stephanbrumme.com/squareRoot.html
Algorithm: Dependent on the IEEE representation and only works for 32 bits
float sqrt7(float x)
{
    unsigned int i = *(unsigned int *)&x;
    // adjust bias
    i += 127 << 23;
    // approximation of square root
    i >>= 1;
    return *(float *)&i;
}
Sqrt8
Reference: http://forums.techarena.in/softwaredevelopment/1290144.htm
Algorithm: Babylonian Method
double sqrt9(const double fg)
{
    double n = fg / 2.0;
    double lstX = 0.0;
    // Iterate until the estimate stops changing
    while (n != lstX)
    {
        lstX = n;
        n = (n + fg / n) / 2.0;
    }
    return n;
}
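The loop above stops only when two successive estimates are bit-for-bit equal, and it does not guard against negative input. A more defensive sketch of the same Babylonian iteration, with a relative tolerance and an iteration cap (babylonian_sqrt, the tolerance and the cap of 1100 are illustrative choices of mine, not taken from the reference):

```cpp
#include <cassert>
#include <cmath>

// Babylonian iteration that stops on a relative tolerance or after a fixed
// number of steps, whichever comes first.
double babylonian_sqrt(const double v, const double rel_tol = 1e-12)
{
    assert(v >= 0.0);
    if (v == 0.0)
        return 0.0;
    double n = v > 1.0 ? v / 2.0 : 1.0;  // positive start at or above sqrt(v)
    for (int iter = 0; iter < 1100; ++iter) {
        const double next = 0.5 * (n + v / n);
        if (std::fabs(next - n) <= rel_tol * next)
            return next;
        n = next;
    }
    return n;  // cap reached: return the best estimate so far
}
```

Far from the root, each step roughly halves the estimate, so the cap is chosen to sit above the number of halvings a double's exponent range should allow. Near the root, convergence is quadratic.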
Sqrt9
Reference: http://www.functionx.com/cpp/examples/squareroot.htm
Algorithm: Babylonian Method
double Abs(double Nbr)
{
    if (Nbr >= 0)
        return Nbr;
    else
        return -Nbr;
}
double sqrt10(double Nbr)
{
    double Number = Nbr / 2;
    const double Tolerance = 1.0e-7;
    // Iterate until Number * Number is within Tolerance of Nbr
    do
    {
        Number = (Number + Nbr / Number) / 2;
    } while (Abs(Number * Number - Nbr) > Tolerance);
    return Number;
}
Sqrt10
Reference: http://www.cs.uni.edu/~jacobson/C++/newton.html
Algorithm: Newton's Approximation Method (the listing below narrows a [lower, upper] bracket by repeated halving)
double sqrt11(const double number)
{
    const double ACCURACY = 0.001;
    double lower, upper, guess;
    if (number < 1)
    {
        lower = number;
        upper = 1;
    }
    else
    {
        lower = 1;
        upper = number;
    }
    while ((upper - lower) > ACCURACY)
    {
        guess = (lower + upper) / 2;
        if (guess * guess > number)
            upper = guess;
        else
            lower = guess;
    }
    return (lower + upper) / 2;
}
Sqrt11
Reference: http://www.drdobbs.com/184409869;jsessionid=AIDFL0EBECDYLQE1GHOSKH4ATMY32JVN
Algorithm: Newton's Approximation Method (the listing below is a halving search between low and high)
double sqrt12(unsigned long N)
{
    double n, p, low, high;
    if (2 > N)
        return (N);
    low = 0;
    high = N;
    while (high > low + 1)
    {
        n = (high + low) / 2;
        p = n * n;
        if (N < p)
            high = n;
        else if (N > p)
            low = n;
        else
            break;
    }
    return (N == p ? n : low);
}
Sqrt12
Reference: http://cjjscript.q8ieng.com/?p=32
Algorithm: Babylonian Method
double sqrt13(int n)
{
    // double a = (eventually the main method will plug values into a)
    double a = (double)n;
    double x = 1;
    // For loop to get the square root value of the entered number.
    for (int i = 0; i < n; i++)
        x = 0.5 * (x + a / x);
    return x;
}
Sqrt13
Reference: N/A
Algorithm: Assembly fsqrt
double sqrt13(double n)
{
    __asm {
        fld n
        fsqrt
    }
}
Sqrt14
Reference: N/A
Algorithm: Assembly fsqrt 2
double inline __declspec(naked) __fastcall sqrt14(double n)
{
    _asm fld qword ptr [esp+4]
    _asm fsqrt
    _asm ret 8
}
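The two assembly listings use the inline-assembly syntax of Microsoft's compiler for 32-bit x86. A rough counterpart for GCC or Clang could look like the sketch below. The function name sqrt_fsqrt and the fallback branch are my assumptions, and this is not a tested drop-in replacement for sqrt14:

```cpp
#include <cassert>
#include <cmath>

// fsqrt wrapper for GCC/Clang on x86 targets; other compilers and
// architectures fall back to std::sqrt.
inline double sqrt_fsqrt(double n)
{
    assert(n >= 0.0);
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    // "+t" ties n to the top of the x87 register stack as input and output.
    __asm__("fsqrt" : "+t"(n));
    return n;
#else
    return std::sqrt(n);
#endif
}
```

On x86-64 builds, compilers typically turn a plain std::sqrt call into a single SSE2 square root instruction already, so it is worth timing both before adopting the assembly version.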
History
1.3 (15 September 2010)
Added Method 14, which is the best method so far
Added modified source code
Added the average feature
License
This article, along with any associated source code and files, is licensed under The Code Project Open License
(CPOL)
Member 10586125
6May14 2:17
I have just tried building a DLL and importing sqrt13 and sqrt14 into my C# project.
I tested them in my method for computing standard deviation.
I tried two versions, __fastcall and __stdcall.
The result is the following: Math.Sqrt in C# is a little bit faster.
The best methods in the article, sqrt13 and sqrt14, are not suitable for x64.
Member 10551010
29Apr14 21:58
Would it be possible to translate your sqrt13 and sqrt14 algorithms into GCC? It would help Linux users who are
non-experts in assembly, like myself, to test it and maybe use it. Thanks!
objectivec
10z
28Apr14 9:04
Marcos Lohmann
20Nov13 5:40
another way to do it
I am using this function to calculate the square root; it has very good performance and precision.
It takes the midpoint of the lower and higher end points to reduce the number of iterations needed to find the answer.
At the deepest decimal positions it may run into an infinite loop, so I had to implement a break point based on
repeated lower/higher ends.
float sqrt(float n) {
    if (n < 0) n = -1 * n;
    float low = 0, high = n, llow = high, lhigh = low, sqrt = 0, res = 0;
    while (res != n) {
        sqrt = (high + low) / 2;
        res = sqrt * sqrt;
        if (res > n) high = sqrt;
        else if (res < n) low = sqrt;
        // Stop when neither end point moved since the previous pass
        if (llow == low && lhigh == high) {
            break;
        } else {
            llow = low;
            lhigh = high;
        }
    }
    return sqrt;
}
Marcos Lohmann
20Nov13 6:41
But the latest one has an issue with numbers between 0 and 1, so I have implemented a routine that multiplies
the number by 100 as many times as it needs to become greater than 1, then divides the result by 10 the same
number of times; that fixes the issue.
float sqrt1(float n) {
    if (n < 0) return -1;
    if (n == 0) return 0; // zero would never leave the scaling loop below
    int times = 1;
    while (n < 1) {
        n = n * 100;
        times++;
    }
    if (n < 0) n = -1 * n;
    float low = 0, high = n, xlow = high, xhigh = low, sqrt = 0, res = 0;
    while (res != n) {
        sqrt = (high + low) / 2;
        res = sqrt * sqrt;
        if (res > n) high = sqrt;
        else if (res < n) low = sqrt;
        if (xlow == low && xhigh == high) {
            break;
        } else {
            xlow = low;
            xhigh = high;
        }
    }
    // Undo the scaling: each factor of 100 on n is a factor of 10 on the root
    for (int i = 1; i < times; i++) {
        sqrt = sqrt / 10;
    }
    return sqrt;
}
Conor Manning
22Feb13 9:22
Mahmoud,
Further to Peter_in_2780's excellent comment, I'd like to point out that there is another error in your analysis of
the precision of each method.
After making the changes Peter suggested, by taking the sum of all of the absolute values of (correct - answer),
you will also need to change what you mean by 'correct'.
In your current implementation, you define precision as how far the answer is from your reference
implementation. Unfortunately, the reference implementation is simply another approximation; there's no
algorithm that can calculate the square root of any given real number with complete accuracy.
So to calculate the precision, you'll actually need to square your sqrt and see how close that is to the input. An
example should make this clear; here I'll take sqrt13:
precision = abs(sqrt13(x) * sqrt13(x) - x)
Something else you might want to consider is that M=10000 isn't really a high value. Some of the methods
listed here, such as the Quake method, have constant order, so I'd expect them to do much better than
iterative methods for high inputs.
I hope you'll find the time to edit the article, because otherwise developers might be misinformed. Thanks.
modified 22Feb13 14:30pm.
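The metric proposed in this comment can be sketched as a small helper that squares each result and compares it with the input. Averaging over x = 1 .. m-1 and the name mean_square_back_error are my assumptions, not part of the comment:

```cpp
#include <cassert>
#include <cmath>

// Mean of |f(x) * f(x) - x| over x = 1 .. m-1. Smaller is better, and no
// reference sqrt implementation is involved.
template <typename F>
double mean_square_back_error(F f, int m)
{
    assert(m > 1);
    double sum = 0.0;
    for (int i = 1; i < m; ++i) {
        const double x = i;
        const double r = f(x);
        sum += std::fabs(r * r - x);
    }
    return sum / (m - 1);
}
```

It can be called with any callable, for example mean_square_back_error([](double x) { return std::sqrt(x); }, 10000). Unlike a comparison of summed outputs, errors of opposite sign cannot cancel here.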
Good Read
Bassam AbdulBaki
29May12 8:26
Good read, especially that magic number from Quake III. If you're also testing accuracy, you may want to add
the new optimal magic number found using the Quake III approach: 0x5f37642f.
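One way to try the suggested constant next to the original one is to pass the magic number in as a parameter. A sketch, assuming a 32-bit IEEE 754 float (sqrt_magic is an illustrative name, and no accuracy ranking between the two constants is claimed here):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// sqrt2-style approximation with the magic constant supplied by the caller.
float sqrt_magic(const float x, const std::uint32_t magic)
{
    assert(x > 0.0f);
    const float xhalf = 0.5f * x;
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    bits = magic - (bits >> 1);             // initial guess for 1/sqrt(x)
    float y;
    std::memcpy(&y, &bits, sizeof y);
    return x * y * (1.5f - xhalf * y * y);  // one Newton step, then times x
}
```

Calling sqrt_magic(v, 0x5f3759dfu) and sqrt_magic(v, 0x5f37642fu) over the same range of v and feeding both into the article's precision measurement would show which constant suits a given workload.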
Thanks
Mahmoud Hesham El
Magdoub
5Jun12 21:33
Iam Nothing
utkarshs
28May12 7:47
Speed here is compared in percentages, so what is the actual time taken by the functions? For example, how
much time would they take to calculate the square root of 2 to 100 decimal places? My method takes 27 seconds
for that, but has 100% accuracy. I mean, not relative to sqrt, but actual 100% accuracy. How would you rate it on
your scale? Answer please.
Mahmoud Hesham El
Magdoub
28May12 8:41
Keep it up
Iam Nothing
Article Copyright 2010 by Mahmoud Hesham El-Magdoub
Everything else Copyright CodeProject, 1999-2014