DataFrame.from_records() should optionally convert None to NaN #893

@gerigk

I still frequently run into problems with NaN versus None.
For example:
I retrieve data from an SQL database (Postgres in my case) and get back a list of tuples containing None (NULL in SQL) values.
I then convert it to a DataFrame using DataFrame.from_records().
To avoid object Series, it would be nice to have None automatically mapped to NaN, the same way from_csv does.
Right now I have to do something like DF.fillna(np.nan) and then convert the numeric columns back to their original type, which is kind of ugly.

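        from datetime import datetime
        import numpy as np
        from pandas import DataFrame

        # results: list of tuples from the SQL query; columns: the column names
        time0 = datetime.now()
        result_dataframe = DataFrame.from_records(results, columns=columns)
        time1 = datetime.now()
        # Replace None with NaN in every object column
        for col in result_dataframe:
            if result_dataframe.dtypes[col] == object:
                result_dataframe[col].fillna(np.nan, inplace=True)
        # Rebuild the frame so the numeric columns get their dtype back
        result_dataframe = DataFrame.from_records(result_dataframe)
        print(datetime.now() - time1, time1 - time0)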

0:00:07.477906 0:00:01.971754

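The fillna pass plus the re-conversion (about 7.5 s, the first value) takes much longer than the construction of the DataFrame itself (about 2.0 s, the second value). This example has 750k rows.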
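
For illustration, here is a minimal sketch of one possible workaround: convert each object column with pd.to_numeric, which maps None to NaN, and leave it untouched if the conversion fails. The helper name none_to_nan and the sample records are hypothetical, not part of pandas or of the original report. Depending on the pandas version, from_records may already infer a float column here, in which case the helper simply leaves it alone.

```python
import numpy as np
import pandas as pd


def none_to_nan(frame):
    """Return a copy in which numeric object columns become numeric (None -> NaN).

    Hypothetical helper for illustration; not a pandas API.
    """
    out = frame.copy()
    for col in out.columns:
        if out[col].dtype == object:
            try:
                # to_numeric treats None as missing and yields NaN
                out[col] = pd.to_numeric(out[col])
            except (ValueError, TypeError):
                # genuinely non-numeric column (e.g. strings): keep as is
                pass
    return out


# Made-up rows standing in for an SQL result set with NULLs
records = [(1, None, "a"), (2, 2.5, None), (3, 4.0, "c")]
df = none_to_nan(
    pd.DataFrame.from_records(records, columns=["id", "value", "label"])
)
print(df.dtypes)
```

This still makes a second pass over the data, so it does not remove the cost described above; it only avoids the fillna-then-rebuild round trip.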
