Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
In [1]: import pandas as pd
In [2]: s = pd.Series([1000213, 2131232, 21312331], dtype='datetime64[s]')
In [3]: s
Out[3]:
0 1970-01-12 13:50:13
1 1970-01-25 16:00:32
2 1970-09-04 16:05:31
dtype: datetime64[s]
In [4]: p = s.astype('datetime64[ms]')
In [6]: p
Out[6]:
0 1970-01-12 13:50:13
1 1970-01-25 16:00:32
2 1970-09-04 16:05:31
dtype: datetime64[ms]
In [7]: s
Out[7]:
0 1970-01-12 13:50:13
1 1970-01-25 16:00:32
2 1970-09-04 16:05:31
dtype: datetime64[s]
In [8]: pd.testing.assert_series_equal(s, p) # Fails as expected, since the dtypes are different
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
Cell In[8], line 1
----> 1 pd.testing.assert_series_equal(s, p)
[... skipping hidden 2 frame]
File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/_testing/asserters.py:596, in raise_assert_detail(obj, message, left, right, diff, first_diff, index_values)
593 if first_diff is not None:
594 msg += f"\n{first_diff}"
--> 596 raise AssertionError(msg)
AssertionError: Attributes of Series are different
Attribute "dtype" are different
[left]: datetime64[s]
[right]: datetime64[ms]
In [9]: pd.testing.assert_series_equal(s, p, check_dtype=False) # I expect this not to raise: we are asking for the dtypes to be ignored, and the data shown above is identical.
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
Cell In[9], line 1
----> 1 pd.testing.assert_series_equal(s, p, check_dtype=False)
[... skipping hidden 1 frame]
File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/_testing/asserters.py:741, in assert_extension_array_equal(left, right, check_dtype, index_values, check_exact, rtol, atol, obj)
732 assert_attr_equal("dtype", left, right, obj=f"Attributes of {obj}")
734 if (
735 isinstance(left, DatetimeLikeArrayMixin)
736 and isinstance(right, DatetimeLikeArrayMixin)
(...)
739 # Avoid slow object-dtype comparisons
740 # np.asarray for case where we have a np.MaskedArray
--> 741 assert_numpy_array_equal(
742 np.asarray(left.asi8),
743 np.asarray(right.asi8),
744 index_values=index_values,
745 obj=obj,
746 )
747 return
749 left_na = np.asarray(left.isna())
[... skipping hidden 1 frame]
File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/_testing/asserters.py:666, in assert_numpy_array_equal.<locals>._raise(left, right, err_msg)
664 diff = diff * 100.0 / left.size
665 msg = f"{obj} values are different ({np.round(diff, 5)} %)"
--> 666 raise_assert_detail(obj, msg, left, right, index_values=index_values)
668 raise AssertionError(err_msg)
File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/_testing/asserters.py:596, in raise_assert_detail(obj, message, left, right, diff, first_diff, index_values)
593 if first_diff is not None:
594 msg += f"\n{first_diff}"
--> 596 raise AssertionError(msg)
AssertionError: Series are different
Series values are different (100.0 %)
[index]: [0, 1, 2]
[left]: [1000213, 2131232, 21312331]
[right]: [1000213000, 2131232000, 21312331000]
Issue Description
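With the newly introduced `datetime64` and `timedelta64` time resolutions, identical data can be held in different dtypes. So when we pass `check_dtype=False` to `assert_series_equal` or `assert_frame_equal`, we expect identical data to pass and not raise an error. As the traceback shows, the comparison is instead made on the raw integer values (`asi8`), which differ by a factor of 1000 between the `s` and `ms` units.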
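The mismatch can be seen directly, along with one possible workaround until the comparison is unit-aware. This is only a sketch, assuming pandas 2.0 or later (where non-nanosecond units are preserved); `assert_series_equal_ignoring_unit` is a hypothetical helper written for this report, not a pandas API, and it only covers `datetime64` data.

```python
import pandas as pd

# Same data as in the report; assumes pandas >= 2.0, where the "s" unit is kept.
s = pd.Series([1000213, 2131232, 21312331], dtype="datetime64[s]")
p = s.astype("datetime64[ms]")

# The stored integers count different units (seconds vs. milliseconds),
# so comparing them directly reports every value as different.
s_raw = s.to_numpy().view("int64")
p_raw = p.to_numpy().view("int64")
print(s_raw.tolist())
print(p_raw.tolist())

# The timestamps themselves still compare equal element by element.
print(bool((s == p).all()))


def assert_series_equal_ignoring_unit(left, right, unit="ns"):
    """Hypothetical helper (not a pandas API): cast both sides to one unit, then compare.

    Casting to "ns" can overflow for dates far from 1970; pass a coarser unit if needed.
    """
    pd.testing.assert_series_equal(
        left.astype(f"datetime64[{unit}]"),
        right.astype(f"datetime64[{unit}]"),
    )


# Does not raise: both sides are brought to the same unit before the comparison.
assert_series_equal_ignoring_unit(s, p)
```

Casting both sides to a common unit keeps the dtype check meaningful while removing the unit difference, which is the behaviour `check_dtype=False` is expected to provide on its own.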
Expected Behavior
In [9]: pd.testing.assert_series_equal(s, p, check_dtype=False) # Passes.
In [10]: pd.testing.assert_series_equal(s, p, check_dtype=True) # Raises error
Installed Versions
INSTALLED VERSIONS
commit : c2a7f1a
python : 3.10.10.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-76-generic
Version : #86-Ubuntu SMP Fri Jan 17 17:24:28 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.0.0rc1
numpy : 1.23.5
pytz : 2023.3
dateutil : 2.8.2
setuptools : 67.6.1
pip : 23.0.1
Cython : 0.29.33
pytest : 7.2.2
hypothesis : 6.70.1
sphinx : 5.3.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.12.0
pandas_datareader: None
bs4 : 4.12.0
bottleneck : None
brotli :
fastparquet : None
fsspec : 2023.3.0
gcsfs : None
matplotlib : None
numba : 0.56.4
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 11.0.0
pyreadstat : None
pyxlsb : None
s3fs : 2023.3.0
scipy : 1.10.1
snappy :
sqlalchemy : 1.4.46
tables : None
tabulate : 0.9.0
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None