BUG: is ser[:2] with Int64Index positional or label-based #49612

jbrockmendel · 2022-11-10T03:27:15Z

import pandas as pd

ser = pd.Series([3, 4, 5])

>>> ser[:2]  # <- should this behave like loc or iloc?
>>> ser.loc[:2]
>>> ser.iloc[:2]

This issue focuses only on integer-dtype indexes, i.e. Int64Index, UInt64Index, RangeIndex, and NumericIndex[inty].

ser[2] and ser[[2]] each treat the indexing as label-based instead of positional. But ser[:2] treats the slice as positional instead of label-based.
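A minimal sketch of the inconsistency described above. The non-default integer labels (10, 20, 30) are an illustrative assumption, not part of the original report; they are chosen so that label-based and positional lookups cannot be confused. Some pandas versions may also emit a FutureWarning on the slice.

```python
import pandas as pd

# Illustrative labels (an assumption for this sketch, not from the issue)
ser = pd.Series([3, 4, 5], index=[10, 20, 30])

scalar = ser[10]      # scalar int key: looked up as the label 10
listlike = ser[[10]]  # list of ints: also looked up as labels
sliced = ser[:2]      # int slice: takes the first two rows by position

print(scalar)
print(listlike.tolist())
print(sliced.tolist())
```

If the slice were label-based it would select nothing here, since no label is less than or equal to 2.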

We've deprecated some of these cases in #45162, but as I go to enforce the deprecation I think I screwed up and issued the warning for too few cases. In particular, to keep the number of warnings down I excluded some cases:

                elif start is None and stop is None:
                    # label-based vs positional is irrelevant
                    pass
                elif isinstance(self, ABCRangeIndex) and self._range == range(
                    len(self)
                ):
                    # In this case there is no difference between label-based
                    #  and positional, so nothing will change.
                    pass
                elif (
                    self.dtype.kind in ["i", "u"]
                    and self._is_strictly_monotonic_increasing
                    and len(self) > 0
                    and self[0] == 0
                    and self[-1] == len(self) - 1
                ):
                    # We are range-like, e.g. created with Index(np.arange(N))
                    pass

The elif start is None and stop is None case is correct, but the other two are wrong because .loc slicing is right-inclusive, so switching to label-based behavior would change the result even for range-like indexes.
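A short sketch of why the range-like cases are not no-ops, using only .loc and .iloc so the result does not depend on how plain [] slicing is resolved. The variable names are illustrative.

```python
import pandas as pd

ser = pd.Series([3, 4, 5])  # default RangeIndex with labels 0, 1, 2

label_based = ser.loc[:2]   # right-inclusive: includes the label 2
positional = ser.iloc[:2]   # right-exclusive: stops before position 2

print(len(label_based))
print(len(positional))
```

Even though the labels coincide with the positions, the two slices differ by one element.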

If I enforce the deprecation for these extra two cases, I get 555 test failures, so this isn't trivial.


jbrockmendel · 2022-11-15T01:42:51Z

@jreback

jreback · 2022-11-16T10:32:26Z

@jbrockmendel your logic seems correct here. agree that s[:2] should be treated as location and warn that it's not positional

jbrockmendel · 2022-11-19T02:17:53Z

@mroeschke thoughts on what to do here?

mroeschke · 2022-11-21T19:53:50Z

I know you wanted to focus on int indexes, but it would be nice if this were positional to align with Series(..., index=non_numeric)[:int] according to this example: https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html#series-is-ndarray-like

Unless you had come across this and wanted to change this too

jbrockmendel · 2022-11-21T21:49:25Z

IIUC what you're suggesting would make/keep int-slicing-with-intindex inconsistent with other int-indexing-with-intindex, but possibly in a way that has fewer corner cases than the status quo?

(Long-term (i.e. 3.0) I'm increasingly thinking we should just deprecate Series.__getitem__ and Series.__setitem__ entirely and tell users to just use loc/iloc, so there's never any ambiguity.)

mroeschke · 2022-11-21T22:07:45Z

> IIUC what you're suggesting would make/keep int-slicing-with-intindex inconsistent with other int-indexing-with-intindex, but possibly in a way that has fewer corner cases than the status quo?

True, int-slicing and int-indexing would diverge in this case for this index type, but int-slicing would be consistent regardless of index type if int-slicing was always positional. I guess there are multiple ways to cut the consistency pie (by indexer, Index type, or both), and I haven't been keeping up well with which direction we should be/are headed, so I don't have super strong opinions on what feels more right.

> (Long-term (i.e. 3.0) im increasingly thinking we should just deprecate Series.__getitem__ and Series.__setitem__ entirely and tell users to just use loc/iloc, so there's never any ambiguity.)

+10.

jbrockmendel · 2022-11-21T22:50:48Z

So I guess for 2.0 we just revert the deprecation?

mroeschke · 2022-11-22T00:15:54Z

Probably good to get more thoughts from others who have looked closer at indexing behavior @phofl @jorisvandenbossche

jbrockmendel · 2022-11-22T00:20:07Z

Actually just reverting wouldn't quite get to the consistent-ish behavior you suggested. We'd also need to change the behavior with Float64Index, which ATM always treats slicing as label-based. Doing that breaks 5 tests locally, though only one directly tests this behavior.

jbrockmendel added the Bug and Needs Triage labels Nov 10, 2022

jbrockmendel mentioned this issue Nov 23, 2022

API: series int-slicing always positional #49869

Closed

5 tasks

jbrockmendel mentioned this issue Dec 15, 2022

REV: revert deprecation of Series.__getitem__ slicing with IntegerIndex #50283

Merged

5 tasks

phofl closed this as completed in #50283 Dec 17, 2022

jbrockmendel mentioned this issue Jan 7, 2023

DEPR: Series.__getitem__, Series.__setitem__ #50617

Open

jbrockmendel mentioned this issue May 23, 2023

DEPR: int slicing always positional #53338

Merged

5 tasks

memeplex mentioned this issue Feb 6, 2024

DOC: Wrong generalization about slice indexing #57277

Closed

1 task
