Series.diff() not exact for huge numbers: approximation or mistake? #2087

@ktii

Series.diff() is not exact for huge numbers. This might be due to some quick approximation or just a bug/mistake.

In [40]: a = 10000000000000000

In [41]: log10(a)
Out[41]: 16.0

In [42]: b = a + 1

In [43]: s = Series([a,b])

In [44]: s.diff()
Out[44]:
0 NaN
1 0

numpy.diff() is not to blame:

In [45]: v = s.values

In [46]: v
Out[46]: array([10000000000000000, 10000000000000001], dtype=int64)

In [47]: diff(v)
Out[47]: array([1], dtype=int64)

One less digit is fine:

In [48]: a = 1000000000000000

In [48]: log10(a)
Out[48]: 15.0

In [49]: b = a + 1

In [50]: s = Series([a,b])

In [51]: s.diff()
Out[51]:
0 NaN
1 1
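A possible explanation, offered as an assumption and not checked against the pandas source: the NaN in the first row suggests the values are converted to float64 before subtracting, and float64 can only represent every integer exactly up to 2**53 (about 9.007e15). The sketch below uses plain Python floats, which are the same IEEE 754 doubles, to show that 10**16 + 1 collapses onto 10**16 while 10**15 + 1 survives. The variable names are illustrative only.

```python
# Illustration only: Python floats are IEEE 754 doubles, like NumPy float64.
# Whether Series.diff() really casts to float64 internally is an assumption.

exact_limit = 2**53  # above this, not every integer has its own double

a = 10**16
b = a + 1
a_above_limit = a > exact_limit        # 10**16 lies beyond 2**53
collapsed = float(a) == float(b)       # both integers map to the same double
big_diff = float(b) - float(a)         # difference after the float round trip

c = 10**15
d = c + 1
small_diff = float(d) - float(c)       # 10**15 + 1 is still below 2**53

print(a_above_limit, collapsed, big_diff, small_diff)
```

If that assumption holds, the 16-digit case loses the +1 in the cast, which would match the 0 reported above, while the 15-digit case stays exact.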

Why do I need these huge numbers? Certain timestamps (in my case VMS) have this many digits (tenths of microseconds since the year 1858, if I remember correctly).
