negative variances

Hello, 

It seems possible to get negative variances due to numerical inaccuracies.  This is because nanops.py, line 120, does not take the absolute value of the result.  A negative variance causes std() to return NaN when it should return 0.

The code below should [probabilistically] recreate the problem.  It could also be turned into a unit test. 

Thanks!  

from pandas import DataFrame
import numpy as np

# 10 identical rows, so every column is constant and its variance should be 0
random_repeated_rows = np.array([np.random.random((10000,))] * 10)
my_var = DataFrame(random_repeated_rows).var()

len(my_var[my_var < 0])                                   # a negative variance shows up slightly less than half of the time
np.min(DataFrame(random_repeated_rows).var())             # returns a tiny negative, e.g. -9.8686491077791697e-16
np.min(DataFrame(random_repeated_rows).values.var(axis=0))  # returns 0
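To illustrate how a tiny negative variance can arise and why it turns std() into NaN, here is a minimal sketch in plain NumPy. It assumes the variance is computed with the one-pass formula (sum of squares minus squared sum over n); whether nanops.py does exactly this is an assumption on my part. The helpers one_pass_var and clamped_var are hypothetical names, not pandas functions. Note that the sketch clips negatives to 0 rather than taking the absolute value as suggested above; that is a different choice, shown only for comparison.

```python
import numpy as np

def one_pass_var(values, ddof=1):
    """One-pass variance along axis 0: (sum(x^2) - sum(x)^2 / n) / (n - ddof).

    Hypothetical helper; assumed to resemble the computation being reported.
    """
    values = np.asarray(values, dtype=float)
    n = values.shape[0]
    sum_x = values.sum(axis=0)
    sum_xx = (values ** 2).sum(axis=0)
    # Subtracting two nearly equal numbers: rounding error can push this below 0.
    return (sum_xx - sum_x ** 2 / n) / (n - ddof)

def clamped_var(values, ddof=1):
    """Same formula, with negative results clipped to 0 (hypothetical fix)."""
    return np.maximum(one_pass_var(values, ddof), 0.0)

rng = np.random.default_rng(0)
row = rng.random(10000)
repeated = np.tile(row, (10, 1))  # 10 identical rows, so every column is constant

raw = one_pass_var(repeated)
safe = clamped_var(repeated)

# sqrt of a negative float is NaN, which is how std() ends up as NaN.
with np.errstate(invalid="ignore"):
    raw_std = np.sqrt(raw)
safe_std = np.sqrt(safe)

print("negative raw variances:", int((raw < 0).sum()))
print("NaN raw stds:          ", int(np.isnan(raw_std).sum()))
print("NaN clamped stds:      ", int(np.isnan(safe_std).sum()))
```

Clipping to 0 guarantees a non-negative variance and a finite std(); taking the absolute value would also avoid the NaN, but would report a tiny positive number instead of exactly 0.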

negative variances #1090