Description
Series.diff() gives an inexact result for very large integers (10**16 in the example below). This might be due to some quick approximation, or it might simply be a bug.
In [40]: a = 10000000000000000
In [41]: log10(a)
Out[41]: 16.0
In [42]: b = a + 1
In [43]: s = Series([a,b])
In [44]: s.diff()
Out[44]:
0 NaN
1 0
numpy.diff() is not to blame:
In [45]: v = s.values
In [46]: v
Out[46]: array([10000000000000000, 10000000000000001], dtype=int64)
In [47]: diff(v)
Out[47]: array([1], dtype=int64)
With one digit fewer, the result is correct:
In [48]: a = 1000000000000000
In [48]: log10(a)
Out[48]: 15.0
In [49]: b = a + 1
In [50]: s = Series([a,b])
In [51]: s.diff()
Out[51]:
0 NaN
1 1
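One possible explanation, offered as an assumption rather than something checked against the pandas source: because the result has to hold NaN in its first slot, the values may be converted to float64 before the subtraction. A float64 has a 53-bit significand, so above 2**53 (about 9.007e15) not every integer is representable. The sketch below uses plain Python floats, which are IEEE 754 doubles, to reproduce both observations without pandas; `float_diff` is a hypothetical helper, not a pandas function.

```python
# Plain Python floats are IEEE 754 doubles, the same format as float64.
LIMIT = 2 ** 53  # every integer up to this magnitude is exactly representable


def float_diff(a, b):
    """Subtract after converting to float (an ASSUMED stand-in for a float64 cast)."""
    return float(b) - float(a)


big = 10 ** 16    # above 2**53, matches the failing case
small = 10 ** 15  # below 2**53, matches the working case

print(big > LIMIT, float_diff(big, big + 1))        # True 0.0
print(small < LIMIT, float_diff(small, small + 1))  # True 1.0
print((big + 1) - big)                              # 1 (integer arithmetic stays exact)
```

If this is indeed the cause, taking the difference on the integer values (as numpy.diff() does above) avoids the loss of precision.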
Why do I need such huge numbers? Certain timestamps (VMS, in my case) have this many digits: they count tenths of a microsecond since the year 1858, if I remember correctly.