DEPR:change DataFrame.where(..,raise_on_error) -> errors #14968

Closed
m-charlton opened this issue Dec 22, 2016 · 4 comments · Fixed by #17744
Labels
API Design Deprecate Functionality to remove in pandas

@m-charlton
Contributor

xref #14878

#14878 deprecated the raise_on_error kwarg in favour of the errors kwarg
for DataFrame.astype().

This issue addresses the same deprecation of the raise_on_error kwarg for
DataFrame.where(). Again this should be replaced with the errors kwarg.

The errors kwarg accepts the values 'raise' or 'ignore' and defaults to
'raise'.
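
A minimal sketch of how the old flag could be mapped onto the new kwarg during the deprecation period. The helper name `resolve_errors` and the warning text are assumptions made for illustration; this is not pandas code and not necessarily how the eventual fix is implemented.

```python
import warnings


def resolve_errors(errors="raise", raise_on_error=None):
    """Map the deprecated ``raise_on_error`` flag onto ``errors``.

    Hypothetical helper for illustration only; not part of pandas.
    """
    if raise_on_error is not None:
        # The old kwarg was passed explicitly: warn and translate it.
        warnings.warn(
            "raise_on_error is deprecated; use errors='raise' or "
            "errors='ignore' instead",
            FutureWarning,
            stacklevel=2,
        )
        errors = "raise" if raise_on_error else "ignore"

    # Only the two values described in this issue are accepted.
    if errors not in ("raise", "ignore"):
        raise ValueError(
            "errors must be 'raise' or 'ignore', got {!r}".format(errors)
        )
    return errors
```

With this sketch, `resolve_errors(raise_on_error=False)` returns `'ignore'` and emits a `FutureWarning`, while `resolve_errors()` returns the default `'raise'` silently.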

This issue can be assigned to me - @m-charlton

@jreback jreback added the Deprecate Functionality to remove in pandas label Dec 22, 2016
@jreback jreback added this to the 0.20.0 milestone Dec 22, 2016
@m-charlton
Contributor Author

I'll start on this tomorrow (Wednesday 4th Jan 2017)

@m-charlton
Contributor Author

It looks as if the value of raise_on_error has no effect on whether the where
method raises exceptions.

The docstring for DataFrame.where says:

raise_on_error : boolean, default True
     Whether to raise on invalid data types (e.g. trying to where on strings)

So comparing a DataFrame of, say, integers against a string should raise an
exception by default, and it does.

In the following example, however, raise_on_error is set to 'False'. Comparing
the string 'a' with integer data still raises a 'TypeError'. The exception
should instead be suppressed and the DataFrame returned unchanged, as is the
case for DataFrame.astype().

>>> import pandas as pd
>>> df = pd.DataFrame(data=[1,2,3])
>>> df.where(df == 'a', raise_on_error=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../pandas/core/ops.py", line 1290, in f
    res = self._combine_const(other, func, raise_on_error=False)
  File ".../pandas/core/frame.py", line 3634, in _combine_const
    raise_on_error=raise_on_error)
  File ".../pandas/core/internals.py", line 3162, in eval
    return self.apply('eval', **kwargs)
  File ".../pandas/core/internals.py", line 3056, in apply
    applied = getattr(b, f)(**kwargs)
  File ".../pandas/core/internals.py", line 1181, in eval
    repr(other))
TypeError: Could not compare ['a'] with block values

In .../pandas/core/ops.py, the 'raise_on_error' kwarg is always set to
'False' [L1290]:

1288             # straight boolean comparisions we want to allow all columns
1289             # (regardless of dtype to pass thru) See #4537 for discussion.
1290             res = self._combine_const(other, func, raise_on_error=False)
1291             return res.fillna(True).astype(bool)

That is an aside, though. The problem seems to be in pandas/core/internals.py:

1157         # get the result
1158         try:
1159             with np.errstate(all='ignore'):
1160                 result = get_result(other)
1161 
1162         # if we have an invalid shape/broadcast error
1163         # GH4576, so raise instead of allowing to pass through
1164         except ValueError as detail:
1165             raise
1166         except Exception as detail:
1167             result = handle_error()
1168 
1169         # technically a broadcast error in numpy can 'work' by returning a
1170         # boolean False
1171         if not isinstance(result, np.ndarray):
1172             if not isinstance(result, np.ndarray):
1173 
1174                 # differentiate between an invalid ndarray-ndarray comparison
1175                 # and an invalid type comparison
1176                 if isinstance(values, np.ndarray) and is_list_like(other):
1177                     raise ValueError('Invalid broadcasting comparison [%s] '
1178                                      'with block values' % repr(other))
1179 
1180                 raise TypeError('Could not compare [%s] with block values' %
1181                                 repr(other))

The 'TypeError' is raised at [L1181], but the raise_on_error argument is only
evaluated in 'handle_error()' [L1167], inside the except branch.

So in this particular case the 'TypeError' exception will always be raised.

Also, the expressions on [L1171] and [L1172] are identical, which looks like a
mistake.
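
To make the control flow easier to see, here is a simplified stand-alone sketch of the pattern above. It is not the pandas implementation: the function names are hypothetical, a plain list stands in for np.ndarray, and it assumes the comparison returns a scalar rather than raising (the "broadcast error can 'work' by returning a boolean False" case mentioned in the source comment).

```python
def handle_error(raise_on_error, detail):
    # Stand-in for the handler: the only place the flag is consulted.
    if raise_on_error:
        raise TypeError("Could not operate: {}".format(detail))
    return None  # placeholder for a "suppressed" result


def evaluate(compare, raise_on_error):
    try:
        result = compare()
    except ValueError:
        # Shape/broadcast errors are always re-raised.
        raise
    except Exception as detail:
        result = handle_error(raise_on_error, detail)

    # This check sits outside the try/except and never looks at
    # raise_on_error, so a non-array result always raises.
    if not isinstance(result, list):
        raise TypeError("Could not compare with block values")
    return result


# The comparison "succeeds" by returning a scalar False, so the
# TypeError is raised even though raise_on_error=False was requested.
try:
    evaluate(lambda: False, raise_on_error=False)
except TypeError as exc:
    print("raised:", exc)
```

In this sketch the flag would only matter if `compare()` itself raised something other than a `ValueError`.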

@jreback
Contributor

jreback commented Mar 23, 2017

@m-charlton PR for this?

@jreback
Contributor

jreback commented May 20, 2017

PR for this?

jreback added a commit to jreback/pandas that referenced this issue Oct 2, 2017
jreback added a commit to jreback/pandas that referenced this issue Oct 3, 2017
jreback added a commit to jreback/pandas that referenced this issue Oct 5, 2017
jreback added a commit to jreback/pandas that referenced this issue Oct 5, 2017
ghost pushed a commit to reef-technologies/pandas that referenced this issue Oct 16, 2017