What’s new in 2.3.1 (July 7, 2025)

These are the changes in pandas 2.3.1. See Release notes for a full changelog including other versions of pandas.

Improvements and fixes for the StringDtype

Most changes in this release relate to StringDtype, which will become the default string dtype in pandas 3.0. See Upcoming changes in pandas 3.0 for more details.

Comparisons between different string dtypes

In previous versions, comparing Series of different string dtypes (e.g. pd.StringDtype("pyarrow", na_value=pd.NA) against pd.StringDtype("python", na_value=np.nan)) would produce an inconsistent result dtype or incorrectly raise (GH 60639). pandas will now use the hierarchy

object < (python, NaN) < (pyarrow, NaN) < (python, NA) < (pyarrow, NA)

to determine the result dtype when different string dtypes are compared. Some examples:

  • When pd.StringDtype("pyarrow", na_value=pd.NA) is compared against any other string dtype, the result will always be boolean[pyarrow].

  • When pd.StringDtype("python", na_value=pd.NA) is compared against pd.StringDtype("pyarrow", na_value=np.nan), the result will be boolean, the NumPy-backed nullable extension array.

  • When pd.StringDtype("python", na_value=pd.NA) is compared against pd.StringDtype("python", na_value=np.nan), the result will be boolean, the NumPy-backed nullable extension array.
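
The precedence rule above can be sketched as a small pure-Python model. This is only an illustration of the ordering, not pandas' internal implementation: the names HIERARCHY, RESULT_DTYPE, dominant_operand and comparison_result_dtype are hypothetical, and each operand is described as either "object" or a (storage, na_value) pair with the missing-value marker spelled as a plain string.

```python
# Toy model of the hierarchy described above; not pandas' internal code.
# All names here are hypothetical and exist only for illustration.
HIERARCHY = [
    "object",
    ("python", "NaN"),
    ("pyarrow", "NaN"),
    ("python", "NA"),
    ("pyarrow", "NA"),
]

# Result dtypes that the examples above state explicitly.
RESULT_DTYPE = {
    ("pyarrow", "NA"): "boolean[pyarrow]",
    ("python", "NA"): "boolean",
}


def dominant_operand(left, right):
    """Return whichever operand ranks higher in the hierarchy."""
    return max(left, right, key=HIERARCHY.index)


def comparison_result_dtype(left, right):
    """Look up the result dtype for the higher-ranked operand.

    Only the NA-based cases are spelled out in the text, so any other
    winner yields None in this sketch.
    """
    return RESULT_DTYPE.get(dominant_operand(left, right))


print(comparison_result_dtype(("pyarrow", "NA"), ("python", "NaN")))
print(comparison_result_dtype(("python", "NA"), ("pyarrow", "NaN")))
print(comparison_result_dtype(("python", "NA"), ("python", "NaN")))
```

The three calls correspond to the three bullets above: the higher-ranked operand alone decides the result dtype, regardless of which side of the comparison it appears on.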

Index set operations ignore empty RangeIndex and object dtype Index

When the future.infer_string option is enabled, Index set operations (like union or intersection) now ignore the dtype of an empty RangeIndex or an empty Index with object dtype when determining the dtype of the resulting Index (GH 60797).

This ensures that combining such an empty Index with strings infers the string dtype correctly, rather than defaulting to object dtype. For example:

>>> pd.options.future.infer_string = True
>>> df = pd.DataFrame()
>>> df.columns.dtype
dtype('int64')               # default RangeIndex for empty columns
>>> df["a"] = [1, 2, 3]
>>> df.columns.dtype
<StringDtype(na_value=nan)>  # new columns use string dtype instead of object dtype

Bug fixes

Contributors

A total of 10 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.

  • David Krych

  • Irv Lustig

  • Joris Van den Bossche

  • Lumberbot (aka Jack)

  • Marc Garcia

  • Matthew Roeschke

  • Pandas Development Team

  • Ralf Gommers

  • Richard Shadrach

  • jbrockmendel