BUG: Boolean 2d array for __setitem__ with DataFrame incorrect results #18582
Labels
Bug
Dtype Conversions
Unexpected or buggy dtype conversions
Error Reporting
Incorrect or improved errors from pandas
Indexing
Related to indexing on series/frames, not to indexes themselves
the 2d ndarray input hits a path where a 1d array is expected, so this is a bug. Idiomatically and what the
Here's a patch to implement the correct conversion (note: it might fail some existing tests). Alternatively we could raise
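For readers hitting this on an affected version, the two options mentioned (convert the mask, or raise) can be sketched on the user side. This is not the patch referred to above; `as_frame_mask` is a hypothetical helper name, and the sketch assumes the mask is meant to line up positionally with the frame.

```python
import numpy as np
import pandas as pd


def as_frame_mask(df, mask):
    """Hypothetical helper: wrap a 2d boolean ndarray as a DataFrame aligned to df."""
    mask = np.asarray(mask)
    if mask.dtype != bool or mask.shape != df.shape:
        # The alternative mentioned above: refuse input that cannot be converted.
        raise ValueError(
            f"expected a boolean mask of shape {df.shape}, "
            f"got dtype {mask.dtype} and shape {mask.shape}"
        )
    return pd.DataFrame(mask, index=df.index, columns=df.columns)


df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
# Assigning through a DataFrame mask sets only the masked elements.
df[as_frame_mask(df, np.array([[True, False], [False, True]]))] = np.nan
```

Wrapping the mask explicitly avoids depending on how a given pandas version treats a bare 2d ndarray key.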
would love a PR!
jreback added the Bug, Difficulty Advanced, Dtype Conversions, Error Reporting, and Indexing labels on Dec 1, 2017
jreback changed the title from "BUG: Boolean ndarray indexing gives incorrect results?" to "BUG: Boolean 2d array for __setitem__ with DataFrame incorrect results" on Dec 1, 2017
Ha! Seems to be my weekend for getting stuck into pandas so I'll give it a crack...
thanks @dhirschfeld
dhirschfeld added a commit to dhirschfeld/pandas that referenced this issue on Dec 5, 2017
dhirschfeld added a commit to dhirschfeld/pandas that referenced this issue on Dec 9, 2017
dhirschfeld added a commit to dhirschfeld/pandas that referenced this issue on Dec 30, 2017
dhirschfeld added a commit to dhirschfeld/pandas that referenced this issue on Dec 30, 2017
According to my understanding of how boolean indexing should work, the below test should pass:
i.e. it shouldn't matter whether the boolean mask is a DataFrame or a plain ndarray.
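The original test is not reproduced here. A minimal sketch of the kind of check described, using made-up data and hypothetical names (`values`, `invalid`), might look like the following. On a pandas version affected by this issue, the final assertion would be expected to fail.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a 4x3 frame and a mask flagging multiples of 5.
values = np.arange(12, dtype=float).reshape(4, 3)
invalid = (values % 5) == 0  # 2d boolean ndarray, same shape as the frame

df_ndarray = pd.DataFrame(values.copy(), columns=list("abc"))
df_frame = df_ndarray.copy()

# Same mask, once as a plain ndarray and once wrapped in a DataFrame.
df_ndarray[invalid] = np.nan
df_frame[pd.DataFrame(invalid, index=df_frame.index, columns=df_frame.columns)] = np.nan

# The expectation stated above: both forms set exactly the masked elements.
pd.testing.assert_frame_equal(df_ndarray, df_frame)
```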
Problem description
As can be seen in the picture below, when indexing a DataFrame with a boolean ndarray, rather than setting the individual True elements, entire rows are being set if any value in the row is True, i.e. what is happening is the result I would expect from df[INVALID.any(axis=1)] = NaN rather than df[INVALID] = NaN.
This is different from how indexing an ndarray works and to me at least is very surprising and entirely unexpected. This behaviour was the cause of a very subtle bug in my code :(
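To make the difference concrete, here is a small sketch with a made-up `INVALID` mask (the data is an assumption, not taken from the report). It contrasts element-wise assignment on a plain ndarray with the row-wise assignment that a 1d row mask produces on a DataFrame, which is the result the report describes seeing for the 2d mask.

```python
import numpy as np
import pandas as pd

# Hypothetical mask with a single True in row 1.
INVALID = np.array([[False, False], [True, False], [False, False]])

# Plain ndarray: only the masked element is replaced.
arr = np.ones((3, 2))
arr[INVALID] = np.nan

# DataFrame indexed by a 1d row mask: the entire row is replaced.
# This is the behaviour the report sees for the 2d mask on affected versions.
df_rows = pd.DataFrame(np.ones((3, 2)), columns=["a", "b"])
df_rows[INVALID.any(axis=1)] = np.nan

print(np.isnan(arr).sum())               # NaN count in the ndarray
print(df_rows.isna().to_numpy().sum())   # NaN count in the frame
```

Comparing the two NaN counts shows how a single flagged element can silently wipe out a whole row.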
Output of pd.show_versions()