groupby error when grouping by float index with non-unique values #29189

Closed
jeybrahms opened this issue Oct 23, 2019 · 5 comments · Fixed by #29700
Labels: Bug, Groupby, Regression (functionality that used to work in a prior pandas version)
Comments

@jeybrahms

Code Sample, a copy-pastable example if possible

import pandas as pd
pd.Series([2, 5, 6, 8], [2.0, 4.0, 4.0, 5.0]).groupby(level=0)

Problem description

This should just work instead of raising.

Expected Output
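The report does not spell out the expected output. As a rough sketch, assuming a pandas version that includes the fix and choosing `sum()` purely for illustration (the report does not name an aggregation), grouping on the float index level would be expected to merge the two rows labelled 4.0:

```python
import pandas as pd

# Same data as the report: values [2, 5, 6, 8] on a float index
# where the label 4.0 appears twice.
ser = pd.Series([2, 5, 6, 8], index=[2.0, 4.0, 4.0, 5.0])

# Assumption: sum() is used only to make the groups visible;
# the original report stops at .groupby(level=0).
result = ser.groupby(level=0).sum()

print(result)
```

On an affected development build, the `groupby` call itself raises before any aggregation runs.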

Output of pd.show_versions()

pandas : 0.26.0.dev0+593.g9d45934af
numpy : 1.17.0

@WillAyd
Member

WillAyd commented Oct 23, 2019

Interesting find. I think this stems from some cleanup in #27909, but it is very surprising that we would raise anything other than a KeyError or IndexError at the line below:

except (KeyError, IndexError):

@jorisvandenbossche @jbrockmendel

@WillAyd
Member

WillAyd commented Oct 23, 2019

Traceback on master for reference

>>> pd.Series([2, 5, 6, 8], [2.0, 4.0, 4.0, 5.0]).groupby(level=0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/williamayd/clones/pandas/pandas/core/generic.py", line 7943, in groupby
    **kwargs
  File "/Users/williamayd/clones/pandas/pandas/core/groupby/groupby.py", line 2494, in groupby
    return klass(obj, by, **kwds)
  File "/Users/williamayd/clones/pandas/pandas/core/groupby/groupby.py", line 391, in __init__
    mutated=self.mutated,
  File "/Users/williamayd/clones/pandas/pandas/core/groupby/grouper.py", line 607, in _get_grouper
    if is_in_obj(gpr):  # df.groupby(df['name'])
  File "/Users/williamayd/clones/pandas/pandas/core/groupby/grouper.py", line 601, in is_in_obj
    return gpr is obj[gpr.name]
  File "/Users/williamayd/clones/pandas/pandas/core/series.py", line 1088, in __getitem__
    result = self.index.get_value(self, key)
  File "/Users/williamayd/clones/pandas/pandas/core/indexes/numeric.py", line 420, in get_value
    loc = self.get_loc(k)
  File "/Users/williamayd/clones/pandas/pandas/core/indexes/numeric.py", line 479, in get_loc
    return super().get_loc(key, method=method, tolerance=tolerance)
  File "/Users/williamayd/clones/pandas/pandas/core/indexes/base.py", line 2826, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 106, in pandas._libs.index.IndexEngine.get_loc
    cpdef get_loc(self, object val):
  File "pandas/_libs/index.pyx", line 128, in pandas._libs.index.IndexEngine.get_loc
    return self._get_loc_duplicates(val)
  File "pandas/_libs/index.pyx", line 143, in pandas._libs.index.IndexEngine._get_loc_duplicates
    left = values.searchsorted(val, side='left')
TypeError: '<' not supported between instances of 'NoneType' and 'NoneType'
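The traceback ends in a sorted search over a non-unique index, and the exception is a TypeError rather than the KeyError or IndexError that the `except` clause expects. A pandas-free sketch of that failure mode follows. It assumes the lookup key is `None` (plausible because the Series is unnamed, but not confirmed in the thread), uses `bisect` as a stand-in for `searchsorted`, and both function names are hypothetical. The error message it produces differs slightly from the one above.

```python
import bisect


def lookup_sorted_with_duplicates(sorted_values, key):
    """Hypothetical stand-in for a lookup on a sorted, non-unique index."""
    left = bisect.bisect_left(sorted_values, key)
    right = bisect.bisect_right(sorted_values, key)
    if left == right:
        # Key absent: the kind of error the caller is prepared for.
        raise KeyError(key)
    return slice(left, right)


def is_in_obj_sketch(sorted_values, key):
    """Mimics the guarded lookup: only KeyError and IndexError are handled."""
    try:
        lookup_sorted_with_duplicates(sorted_values, key)
        return True
    except (KeyError, IndexError):
        return False


index_values = [2.0, 4.0, 4.0, 5.0]

print(is_in_obj_sketch(index_values, 4.0))  # key present
print(is_in_obj_sketch(index_values, 3.0))  # key absent, KeyError is caught

# Comparing a float with None raises TypeError inside the binary search,
# and that exception passes straight through the except clause.
try:
    is_in_obj_sketch(index_values, None)
    escaped = None
except TypeError as exc:
    escaped = exc

print(type(escaped).__name__)
```

Whether the right fix is to widen the `except` clause or to avoid the lookup when the name is `None` is a design question this sketch does not settle.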

WillAyd added the Bug and Regression (functionality that used to work in a prior pandas version) labels Oct 23, 2019
WillAyd added this to the 1.0 milestone Nov 15, 2019

@WillAyd
Member

WillAyd commented Nov 15, 2019

@jbrockmendel do you mind taking a look at this one? I think it would be good to fix before releasing 1.0.

@jbrockmendel
Member

Sure, probably next week.

@WillAyd
Member

WillAyd commented Nov 15, 2019 via email
