Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: to_datetime with M or Y unit and non-round float #50301

Merged
merged 10 commits into from
Dec 28, 2022

Conversation

jbrockmendel
Copy link
Member

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

@MarcoGorelli
Copy link
Member

If non-round values aren't allowed, would this supersede #50183?

@jbrockmendel
Copy link
Member Author

If non-round values aren't allowed, would this supersede #50183?

Its only for unit="M" or unit="Y"

@MarcoGorelli MarcoGorelli self-requested a review December 16, 2022 18:50
pandas/tests/tools/test_to_datetime.py Show resolved Hide resolved
pandas/tests/tools/test_to_datetime.py Show resolved Hide resolved
pandas/_libs/tslib.pyx Outdated Show resolved Hide resolved
pandas/_libs/tslib.pyx Show resolved Hide resolved
pandas/_libs/tslib.pyx Outdated Show resolved Hide resolved
@mroeschke mroeschke added the Datetime Datetime data dtype label Dec 20, 2022
Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple of minor comments, other than that looks good

doc/source/whatsnew/v2.0.0.rst Outdated Show resolved Hide resolved
pandas/tests/tools/test_to_datetime.py Show resolved Hide resolved
Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @jbrockmendel

pandas/tests/tools/test_to_datetime.py Show resolved Hide resolved
@MarcoGorelli MarcoGorelli added this to the 2.0 milestone Dec 22, 2022
Comment on lines +346 to +356
if is_ym and not fval.is_integer():
# Analogous to GH#47266 for Timestamp
if is_raise:
raise ValueError(
f"Conversion of non-round float with unit={unit} "
"is ambiguous"
)
elif is_ignore:
raise AssertionError
iresult[i] = NPY_NAT
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry to block this, but is there a test that hits this?

I can see tests when val is a non-round float, but what about when it's a string containing a non-round float (e.g. '1.5'), which I believe is what would reach this branch?

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks @jbrockmendel !

@mroeschke any thoughts?

@mroeschke mroeschke merged commit 83c2a5f into pandas-dev:main Dec 28, 2022
@mroeschke
Copy link
Member

Thanks @jbrockmendel

@jbrockmendel jbrockmendel deleted the depr-cast_from_unit branch December 28, 2022 19:09
rebecca-palmer added a commit to rebecca-palmer/pandas that referenced this pull request Jan 18, 2023
mroeschke pushed a commit that referenced this pull request Jan 20, 2023
* add test for float to_datetime near overflow bounds

* fix float to_datetime near overflow bounds

* fix typo and formatting

* fix formatting

* fix test to not fail on rounding differences

* don't use approximate comparison on datetimes, it doesn't work

* also can't convert datetime to float

* match dtypes

* TST: don't try to use non-integer years (see #50301)

* TST: don't cross an integer

(tsmax_in_days happens to be close to an integer,
and this is a test of rounding)

* PERF: remove unnecessary copy

* add whatsnew
pooja-subramaniam pushed a commit to pooja-subramaniam/pandas that referenced this pull request Jan 25, 2023
* add test for float to_datetime near overflow bounds

* fix float to_datetime near overflow bounds

* fix typo and formatting

* fix formatting

* fix test to not fail on rounding differences

* don't use approximate comparison on datetimes, it doesn't work

* also can't convert datetime to float

* match dtypes

* TST: don't try to use non-integer years (see pandas-dev#50301)

* TST: don't cross an integer

(tsmax_in_days happens to be close to an integer,
and this is a test of rounding)

* PERF: remove unnecessary copy

* add whatsnew
QuLogic added a commit to QuLogic/pandas that referenced this pull request Feb 21, 2024
In Pandas 1.5.3, the `float(val)` cast was inline to the
`cast_from_unit` call in `array_with_unit_to_datetime`. This caused the
intermediate (unnamed) value to be a Python float.

Since pandas-dev#50301, an temporary variable was added to avoid multiple casts,
but with explicit type `cdef float`, which defines a _Cython_ float.
This type is 32-bit, and causes a loss of precision, and a regression in
parsing from 1.5.3.

Since `cast_from_unit` takes an `object`, not a more specific Cython
type, remove the explicit type from the temporary `fval` variable
entirely. This will cause it to be a (64-bit) Python float, and thus not
lose precision.

Fixes pandas-dev#57051
QuLogic added a commit to QuLogic/pandas that referenced this pull request Feb 21, 2024
In Pandas 1.5.3, the `float(val)` cast was inline to the
`cast_from_unit` call in `array_with_unit_to_datetime`. This caused the
intermediate (unnamed) value to be a Python float.

Since pandas-dev#50301, a temporary variable was added to avoid multiple casts,
but with explicit type `cdef float`, which defines a _Cython_ float.
This type is 32-bit, and causes a loss of precision, and a regression in
parsing from 1.5.3.

Since `cast_from_unit` takes an `object`, not a more specific Cython
type, remove the explicit type from the temporary `fval` variable
entirely. This will cause it to be a (64-bit) Python float, and thus not
lose precision.

Fixes pandas-dev#57051
QuLogic added a commit to QuLogic/pandas that referenced this pull request Feb 22, 2024
In Pandas 1.5.3, the `float(val)` cast was inline to the
`cast_from_unit` call in `array_with_unit_to_datetime`. This caused the
intermediate (unnamed) value to be a Python float.

Since pandas-dev#50301, a temporary variable was added to avoid multiple casts,
but with explicit type `cdef float`, which defines a _Cython_ float.
This type is 32-bit, and causes a loss of precision, and a regression in
parsing from 1.5.3.

So widen the explicit type of the temporary `fval` variable to (64-bit)
`double`, which will not lose precision.

Fixes pandas-dev#57051
QuLogic added a commit to QuLogic/pandas that referenced this pull request Feb 22, 2024
In Pandas 1.5.3, the `float(val)` cast was inline to the
`cast_from_unit` call in `array_with_unit_to_datetime`. This caused the
intermediate (unnamed) value to be a Python float.

Since pandas-dev#50301, a temporary variable was added to avoid multiple casts,
but with explicit type `cdef float`, which defines a _Cython_ float.
This type is 32-bit, and causes a loss of precision, and a regression in
parsing from 1.5.3.

So widen the explicit type of the temporary `fval` variable to (64-bit)
`double`, which will not lose precision.

Fixes pandas-dev#57051
QuLogic added a commit to QuLogic/pandas that referenced this pull request Mar 27, 2024
In Pandas 1.5.3, the `float(val)` cast was inline to the
`cast_from_unit` call in `array_with_unit_to_datetime`. This caused the
intermediate (unnamed) value to be a Python float.

Since pandas-dev#50301, a temporary variable was added to avoid multiple casts,
but with explicit type `cdef float`, which defines a _Cython_ float.
This type is 32-bit, and causes a loss of precision, and a regression in
parsing from 1.5.3.

So widen the explicit type of the temporary `fval` variable to (64-bit)
`double`, which will not lose precision.

Fixes pandas-dev#57051
mroeschke pushed a commit that referenced this pull request Mar 27, 2024
In Pandas 1.5.3, the `float(val)` cast was inline to the
`cast_from_unit` call in `array_with_unit_to_datetime`. This caused the
intermediate (unnamed) value to be a Python float.

Since #50301, a temporary variable was added to avoid multiple casts,
but with explicit type `cdef float`, which defines a _Cython_ float.
This type is 32-bit, and causes a loss of precision, and a regression in
parsing from 1.5.3.

So widen the explicit type of the temporary `fval` variable to (64-bit)
`double`, which will not lose precision.

Fixes #57051
pmhatre1 pushed a commit to pmhatre1/pandas-pmhatre1 that referenced this pull request May 7, 2024
…as-dev#57548)

In Pandas 1.5.3, the `float(val)` cast was inline to the
`cast_from_unit` call in `array_with_unit_to_datetime`. This caused the
intermediate (unnamed) value to be a Python float.

Since pandas-dev#50301, a temporary variable was added to avoid multiple casts,
but with explicit type `cdef float`, which defines a _Cython_ float.
This type is 32-bit, and causes a loss of precision, and a regression in
parsing from 1.5.3.

So widen the explicit type of the temporary `fval` variable to (64-bit)
`double`, which will not lose precision.

Fixes pandas-dev#57051
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants