REGR: Buffer dtype mismatch, expected 'Python object' but got a string #38900

forgetso · 2021-01-02T14:42:12Z

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

>>> import pandas as pd
>>> df = pd.DataFrame(['o',"o'doul's",'oad','oaf','oafish'])
>>> df = df.astype('|S')
>>> df
             0
0         b'o'
1  b"o'doul's"
2       b'oad'
3       b'oaf'
4    b'oafish'
>>> import numpy as np
>>> df.replace({None:np.nan})

Problem description

This worked in 1.1.5. On the latest version, the same call raises:

    df[c].replace({None: np.NaN}, inplace=True)
  File "/home/user/anaconda3/envs/uniqueness/lib/python3.8/site-packages/pandas/core/series.py", line 4479, in replace
    return super().replace(
  File "/home/user/anaconda3/envs/uniqueness/lib/python3.8/site-packages/pandas/core/generic.py", line 6840, in replace
    return self.replace(
  File "/home/user/anaconda3/envs/uniqueness/lib/python3.8/site-packages/pandas/core/series.py", line 4479, in replace
    return super().replace(
  File "/home/user/anaconda3/envs/uniqueness/lib/python3.8/site-packages/pandas/core/generic.py", line 6889, in replace
    new_data = self._mgr.replace_list(
  File "/home/user/anaconda3/envs/uniqueness/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 664, in replace_list
    bm = self.apply(
  File "/home/user/anaconda3/envs/uniqueness/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 427, in apply
    applied = getattr(b, f)(**kwargs)
  File "/home/user/anaconda3/envs/uniqueness/lib/python3.8/site-packages/pandas/core/internals/blocks.py", line 910, in _replace_list
    [b.convert(numeric=False, copy=True) for b in result]
  File "/home/user/anaconda3/envs/uniqueness/lib/python3.8/site-packages/pandas/core/internals/blocks.py", line 910, in <listcomp>
    [b.convert(numeric=False, copy=True) for b in result]
  File "/home/user/anaconda3/envs/uniqueness/lib/python3.8/site-packages/pandas/core/internals/blocks.py", line 2558, in convert
    values = f(None, self.values.ravel(), None)
  File "/home/user/anaconda3/envs/uniqueness/lib/python3.8/site-packages/pandas/core/internals/blocks.py", line 2542, in f
    values = soft_convert_objects(
  File "/home/user/anaconda3/envs/uniqueness/lib/python3.8/site-packages/pandas/core/dtypes/cast.py", line 1139, in soft_convert_objects
    values = lib.maybe_convert_objects(values, convert_datetime=True)
  File "pandas/_libs/lib.pyx", line 2125, in pandas._libs.lib.maybe_convert_objects
ValueError: Buffer dtype mismatch, expected 'Python object' but got a string

Expected Output

Any `None` values are replaced with `NaN`, and no exception is raised.
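The expected behaviour can be written down as a small check. This is only a sketch: the function name `check_replace_on_bytes_frame` is made up for illustration, and it assumes a pandas version that includes the fix referenced later in this thread (#39065). On an affected version the `replace` call raises the `ValueError` shown above instead.

```python
import numpy as np
import pandas as pd


def check_replace_on_bytes_frame():
    """Hypothetical regression check for the example in this issue."""
    # Same frame as in the code sample: strings cast to bytes with '|S'.
    df = pd.DataFrame(["o", "o'doul's", "oad", "oaf", "oafish"]).astype("|S")

    # The frame contains no None values, so replace should be a no-op
    # rather than raising "Buffer dtype mismatch".
    result = df.replace({None: np.nan})

    # Compare the column contents element by element.
    assert result[0].tolist() == df[0].tolist()
    return result


result = check_replace_on_bytes_frame()
```

Because the frame has no `None` entries, the check only asserts that the contents come back unchanged; it makes no claim about the resulting dtype.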

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit : b5958ee
python : 3.8.5.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-128-generic
Version : #131-Ubuntu SMP Wed Dec 9 06:57:35 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 1.1.5
numpy : 1.19.4
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2.4
setuptools : 50.3.2
Cython : 3.0a6
pytest : 6.1.2
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.5.2
html5lib : None
pymysql : None
psycopg2 : 2.8.5 (dt dec pq3 ext lo64)
jinja2 : 3.0.0a1
IPython : 7.17.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.2
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.20
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : 0.52.0


jreback · 2021-01-02T15:03:21Z

pls show a complete example
eg how the original frame is constructed

forgetso · 2021-01-02T17:48:10Z

> pls show a complete example
> eg how the original frame is constructed

I've added an example df showing how to replicate the bug.

mzeitlin11 · 2021-01-02T19:50:08Z

The same example also raises for a simpler frame, such as
df = pd.DataFrame(['o'])

First bad commit 3f51060, PR (#38151)

simonjayhawkins · 2021-01-03T11:41:02Z

> First bad commit 3f51060, PR (#38151)

cc @jbrockmendel

jbrockmendel · 2021-01-04T19:13:50Z

It isn't clear to me how #38151 broke this, but the fix (or at least a fix) is in `ObjectBlock._maybe_coerce_values`, where we need to check for `issubclass(values.dtype.type, (str, bytes))` instead of `issubclass(values.dtype.type, str)`.
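To see why the narrower check misses this case, here is a NumPy-only sketch. The helper `holds_python_text` is a hypothetical stand-in for the condition described above, not the actual pandas implementation, and the cast to `object` at the end is an assumption about what the coercion is meant to achieve.

```python
import numpy as np


def holds_python_text(values: np.ndarray) -> bool:
    """Hypothetical helper: the widened check suggested in this comment."""
    return issubclass(values.dtype.type, (str, bytes))


# A unicode array has scalar type np.str_, which subclasses str.
str_values = np.array(["o", "oad"])

# Casting with 'S' gives a bytes array; its scalar type np.bytes_
# subclasses bytes, not str.
bytes_values = str_values.astype("S")

# The str-only check does not match the bytes array ...
old_check = issubclass(bytes_values.dtype.type, str)

# ... while the widened check does.
new_check = holds_python_text(bytes_values)

# Assumed purpose of the coercion: store the values as Python objects,
# which is what a consumer expecting a 'Python object' buffer needs.
as_object = bytes_values.astype(object)
```

With the `str`-only check, a bytes array is left with its fixed-width `S` dtype, which matches the "expected 'Python object' but got a string" message in the traceback.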

forgetso added the Bug and Needs Triage labels Jan 2, 2021

simonjayhawkins added the Needs Info label and removed the Needs Triage label Jan 2, 2021

mzeitlin11 added the Regression, Strings and replace labels and removed the Bug and Needs Info labels Jan 2, 2021

simonjayhawkins added this to the 1.2.1 milestone Jan 3, 2021

simonjayhawkins added a commit to simonjayhawkins/pandas that referenced this issue Jan 6, 2021

code sample for pandas-dev#38900

7912621

simonjayhawkins changed the title ~~BUG: Buffer dtype mismatch, expected 'Python object' but got a string~~ REGR: Buffer dtype mismatch, expected 'Python object' but got a string Jan 6, 2021

phofl mentioned this issue Jan 9, 2021 in "Regression in replace raising ValueError for bytes object" #39065 (merged)

jreback closed this as completed in #39065 Jan 9, 2021
