BUG: Multi-indexed Series to Numpy fails when na_value supplied #45774

Closed
@DamianBarabonkovQC

Description

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples([(0, "a"), (0, "b")])
x = pd.Series([1, 2], index=index)

print(x.to_numpy(dtype="float")) # [1. 2.]
print(x.to_numpy(dtype="float", na_value=np.nan)) # KeyError: 1

Issue Description

Converting a MultiIndexed Series to a NumPy array works correctly when no na_value is specified. When an na_value is specified, the conversion raises a KeyError, even if the Series contains no NA values.

Digging into this, when na_value is specified, to_numpy takes a branch that writes na_value into the NumPy result array:

result[self.isna()] = na_value

However, self.isna() returns a boolean Series that still carries the MultiIndex, and the NumPy result array cannot be indexed with that MultiIndexed Series.

Pull Request #45775 is one possible way to address this issue.
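
To illustrate the failing step outside of pandas internals, the sketch below repeats the assignment by hand but converts the mask to a plain NumPy array first. This is only an assumption about one way to avoid the problem, not necessarily the change made in the linked pull request; the names result, mask and plain_mask are illustrative.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples([(0, "a"), (0, "b")])
x = pd.Series([1, 2], index=index)

# Writable float copy of the values. No na_value is passed here,
# so this call does not go through the failing branch.
result = x.to_numpy(dtype="float", copy=True)

# isna() returns a boolean Series that still carries the MultiIndex.
mask = x.isna()

# Assumption: dropping the index gives NumPy a plain boolean mask it can use.
plain_mask = np.asarray(mask)

result[plain_mask] = np.nan
print(result)  # [1. 2.]
```

Because the mask no longer has an index attached, NumPy treats it as an ordinary boolean mask and the assignment leaves the two non-missing values untouched.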

Expected Behavior

import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples([(0, "a"), (0, "b")])
x = pd.Series([1, 2], index=index)

print(x.to_numpy(dtype="float")) # [1. 2.]
print(x.to_numpy(dtype="float", na_value=np.nan)) # [1. 2.]
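
For a Series that does contain a missing value, the expected substitution can be sketched with a small stand-in helper. The function to_numpy_filled is hypothetical (it is not part of pandas) and only mimics what na_value is expected to do; it could serve as a workaround on an affected version.

```python
import numpy as np
import pandas as pd


def to_numpy_filled(series, dtype, na_value):
    """Hypothetical helper: convert to NumPy and replace missing entries."""
    # Take the mask as a plain ndarray so no index is involved.
    mask = series.isna().to_numpy()
    # Copy so the Series' own data is never modified.
    result = series.to_numpy(dtype=dtype, copy=True)
    result[mask] = na_value
    return result


index = pd.MultiIndex.from_tuples([(0, "a"), (0, "b")])
y = pd.Series([1.0, None], index=index)  # second entry is missing

filled = to_numpy_filled(y, dtype="float", na_value=-1.0)
print(filled.tolist())  # [1.0, -1.0]
```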

Installed Versions

Running pd.show_versions() produces an error, so the versions of the relevant packages are listed below.

python 3.10.1 h1248fe1_2_cpython conda-forge

pandas 1.5.0.dev0+268.gbe8d1ec880 dev_0 # Installed locally
numpy 1.22.0 pypi_0 pypi

Metadata

Labels: Bug; Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate); MultiIndex
