to_datetime(foo, errors='coerce') does not swallow all errors #28299

Closed
@miggec

Description

Code Sample

import pandas

# Raises ValueError in pandas 0.25.1 despite errors='coerce':
pandas.to_datetime('200622-12-31', errors='coerce')
# In pandas 0.23.4 the same call returns Timestamp('2022-06-21 19:00:00')

# This also raises ValueError in 0.25.1:
pandas.to_datetime('111111-24-11', errors='coerce')

# But this does not raise:
pandas.to_datetime('111111-23-11', errors='coerce')

Problem description

I have some text files with malformed dates, which I process with the code above. While migrating my code from pandas 0.23.4 to 0.25.1, I got the following traceback:

.../my_file.py in <module>
----> 1 pandas.to_datetime('200622-12-31', errors='coerce')

.../lib/python3.7/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    206                 else:
    207                     kwargs[new_arg_name] = new_arg_value
--> 208             return func(*args, **kwargs)
    209 
    210         return wrapper

.../lib/python3.7/site-packages/pandas/core/tools/datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, box, format, exact, unit, infer_datetime_format, origin, cache)
    794             result = convert_listlike(arg, box, format)
    795     else:
--> 796         result = convert_listlike(np.array([arg]), box, format)[0]
    797 
    798     return result

.../lib/python3.7/site-packages/pandas/core/tools/datetimes.py in _convert_listlike_datetimes(arg, box, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
    461             errors=errors,
    462             require_iso8601=require_iso8601,
--> 463             allow_object=True,
    464         )
    465 

.../lib/python3.7/site-packages/pandas/core/arrays/datetimes.py in objects_to_datetime64ns(data, dayfirst, yearfirst, utc, errors, require_iso8601, allow_object)
   1982             return values.view("i8"), tz_parsed
   1983         except (ValueError, TypeError):
-> 1984             raise e
   1985 
   1986     if tz_parsed is not None:

.../lib/python3.7/site-packages/pandas/core/arrays/datetimes.py in objects_to_datetime64ns(data, dayfirst, yearfirst, utc, errors, require_iso8601, allow_object)
   1973             dayfirst=dayfirst,
   1974             yearfirst=yearfirst,
-> 1975             require_iso8601=require_iso8601,
   1976         )
   1977     except ValueError as e:

pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime()

pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime()

ValueError: offset must be a timedelta strictly between -timedelta(hours=24) and timedelta(hours=24).
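
The message text matches the one Python's `datetime.timezone` constructor raises for out-of-range UTC offsets. As a hedged illustration only: if the trailing `-23-11` and `-24-11` in the sample strings end up being treated as UTC offsets (an assumption, not a confirmed diagnosis, and one that does not explain how `'200622-12-31'` is tokenized), the standard-library sketch below reproduces the 23 versus 24 boundary. It does not call pandas or dateutil, and `offset_is_valid` is a hypothetical helper name.

```python
from datetime import timedelta, timezone


def offset_is_valid(hours: int, minutes: int) -> bool:
    """Return True if datetime.timezone accepts an offset of -(hours:minutes)."""
    try:
        timezone(-timedelta(hours=hours, minutes=minutes))
    except ValueError:
        # Raised when the offset is not strictly between -24h and +24h.
        return False
    return True


# Assumption: '111111-23-11' and '111111-24-11' end in offsets -23:11 and -24:11.
print(offset_is_valid(23, 11))  # True
print(offset_is_valid(24, 11))  # False
```

If this reading is right, the exception originates while the timezone object is being built, not from ordinary date parsing. That may be why it is not caught by the `errors='coerce'` handling.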

Expected Output

The main expectation is that no exception is raised when errors='coerce' is passed.

I would expect pandas.to_datetime('200622-12-31', errors='coerce') to return NaT, although pandas 0.23.4 parses it into Timestamp('2022-06-21 19:00:00').
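
Until the behaviour is fixed, one possible stopgap is to wrap the call and fall back to NaT when an exception escapes. This is a minimal sketch, not a pandas feature: `to_datetime_or_nat` is a hypothetical helper, and the set of exceptions it catches is an assumption based on the traceback above.

```python
import pandas


def to_datetime_or_nat(value):
    """Hypothetical helper: parse one value, returning NaT if parsing raises."""
    try:
        return pandas.to_datetime(value, errors='coerce')
    except (ValueError, OverflowError):
        # errors='coerce' did not swallow the error, so coerce it here.
        return pandas.NaT


for text in ['2019-09-05', '200622-12-31', '111111-24-11']:
    print(text, '->', to_datetime_or_nat(text))
```

The wrapper only guarantees that no exception propagates. Whether a malformed string yields NaT or some Timestamp still depends on the installed pandas and dateutil versions.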

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.7.3.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-58-generic
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_US.UTF-8

pandas : 0.25.1
numpy : 1.16.4
pytz : 2019.1
dateutil : 2.8.0
pip : 19.1.1
setuptools : 41.0.1
Cython : 0.29.10
pytest : 4.6.3
hypothesis : 4.7.3
sphinx : None
blosc : None
feather : 0.4.0
xlsxwriter : 1.1.8
lxml.etree : 4.3.3
html5lib : 1.0.1
pymysql : 0.9.3
psycopg2 : 2.7.7 (dt dec pq3 ext lo64)
jinja2 : 2.10.1
IPython : 7.5.0
pandas_datareader: None
bs4 : 4.7.1
bottleneck : 1.2.1
fastparquet : None
gcsfs : 0.3.0
lxml.etree : 4.3.3
matplotlib : 3.1.1
numexpr : 2.6.9
odfpy : None
openpyxl : 2.6.2
pandas_gbq : None
pyarrow : 0.13.0
pytables : None
s3fs : None
scipy : 1.3.0
sqlalchemy : 1.2.14
tables : 3.5.2
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.1.8
