BUG: pd.read_pickle not backwards compatible from v2.1.0 to v1.3.4 #55137

Closed
3 tasks done
Tracked by #5
elerys opened this issue Sep 14, 2023 · 17 comments · Fixed by #56308
Labels: Bug; IO Pickle (read_pickle, to_pickle); Regression (functionality that used to work in a prior pandas version)

@elerys

elerys commented Sep 14, 2023

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

# in a pandas 1.3.4 environment

import pandas as pd
import numpy as np

df = pd.DataFrame(np.ones((100, 4)))
df.to_pickle("old_pandas.pkl")



# in pandas 2.1.0 environment
import pandas as pd

df = pd.read_pickle("old_pandas.pkl")

Issue Description

The read_pickle method raises an error when reading a file that was serialized with to_pickle in pandas 1.3.4. No error is raised when the same file is read with pandas 2.0, or when the file was serialized with pandas >=1.4.

The documentation states the method should be backwards compatible to pandas 0.20.3 provided the object was serialized with to_pickle (as in this case).

Traceback (most recent call last):
  File "/home/me/envs/new_pandas/lib/python3.10/site-packages/pandas/io/pickle.py", line 206, in read_pickle
    return pickle.load(handles.handle)
  File "/home/me/envs/new_pandas/lib/python3.10/site-packages/pandas/core/internals/blocks.py", line 2400, in new_block
    return klass(values, ndim=ndim, placement=placement, refs=refs)
TypeError: Argument 'placement' has incorrect type (expected pandas._libs.internals.BlockPlacement, got slice)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/me/envs/new_pandas/lib/python3.10/site-packages/pandas/compat/pickle_compat.py", line 35, in load_reduce
    stack[-1] = func(*args)
  File "/home/me/envs/new_pandas/lib/python3.10/site-packages/pandas/core/internals/blocks.py", line 2400, in new_block
    return klass(values, ndim=ndim, placement=placement, refs=refs)
TypeError: Argument 'placement' has incorrect type (expected pandas._libs.internals.BlockPlacement, got slice)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/me/envs/new_pandas/lib/python3.10/site-packages/pandas/io/pickle.py", line 211, in read_pickle
    return pc.load(handles.handle, encoding=None)
  File "/home/me/envs/new_pandas/lib/python3.10/site-packages/pandas/compat/pickle_compat.py", line 225, in load
    return up.load()
  File "/usr/lib/python3.10/pickle.py", line 1213, in load
    dispatch[key[0]](self)
  File "/home/me/envs/new_pandas/lib/python3.10/site-packages/pandas/compat/pickle_compat.py", line 55, in load_reduce
    elif args and issubclass(args[0], PeriodArray):
TypeError: issubclass() arg 1 must be a class
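
The last TypeError in the chain comes from issubclass() receiving a first argument that is not a class. Below is a minimal, stdlib-only sketch of that failure mode and of one defensive guard. PeriodLike and first_arg_is_subclass are hypothetical names used for illustration; this is not pandas code and not necessarily how the eventual fix works.

```python
class PeriodLike:
    """Hypothetical stand-in for a class such as PeriodArray."""


def first_arg_is_subclass(args: tuple, cls: type) -> bool:
    """Return True only when args[0] exists, is a class, and subclasses cls."""
    return bool(args) and isinstance(args[0], type) and issubclass(args[0], cls)


# Handing issubclass() a non-class first argument (here a list) raises TypeError.
try:
    issubclass([1.0, 1.0], PeriodLike)
except TypeError as exc:
    message = str(exc)

print(message)
# The guarded check returns False instead of raising.
print(first_arg_is_subclass(([1.0, 1.0],), PeriodLike))
print(first_arg_is_subclass((PeriodLike,), PeriodLike))
```

Checking isinstance(args[0], type) first lets a compatibility shim fall through to its next branch rather than masking the original error with a second one.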

Expected Behavior

Method should be backwards compatible, as per the documentation.

Installed Versions

INSTALLED VERSIONS

commit : f00efd0
python : 3.10.12.final.0
python-bits : 64
OS : Linux
OS-release : 5.15.0-78-generic
Version : #85-Ubuntu SMP Fri Jul 7 15:25:09 UTC 2023
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 2.2.0dev0+225.gf00efd0344
numpy : 1.25.2
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 59.6.0
pip : 22.0.2
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader : None
bs4 : None
bottleneck : None
dataframe-api-compat: None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None

INSTALLED VERSIONS

commit : 945c9ed
python : 3.10.12.final.0
python-bits : 64
OS : Linux
OS-release : 5.15.0-78-generic
Version : #85-Ubuntu SMP Fri Jul 7 15:25:09 UTC 2023
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 1.3.4
numpy : 1.25.2
pytz : 2023.3.post1
dateutil : 2.8.2
pip : 22.0.2
setuptools : 59.6.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None

elerys added the Bug and Needs Triage labels on Sep 14, 2023
rhshadrach added the IO Pickle label on Sep 14, 2023
@aterrel
Contributor

aterrel commented Sep 15, 2023

Support seems to have been removed in #49316.

We need to decide whether to revert that change or update the documentation.

@aterrel
Contributor

aterrel commented Sep 16, 2023

After investigating further: for this example, pandas 0.20.3 still works but pandas 1.3.4 does not.

I will add both examples in the PR.

@aterrel
Contributor

aterrel commented Sep 17, 2023

Okay, I bisected this and see the bug with 1.3.4 starting at 15fd7d7.

@timvdlee

timvdlee commented Nov 8, 2023

Is there an update on this issue? I am encountering the same bug

@PennyWieser

Same bug here.

@PapaNoyel

Encountered it today.

@v0lle85

v0lle85 commented Nov 17, 2023

I'm affected, too.
Thanks to the comment by tomvannuenen in this issue I tried downgrading pandas to version 2.0.3, which in my case does the trick for now.
(Most likely I could have known by looking up the commit that broke it mentioned by aterrel, but anyway...)

@patticat

Same problem. 2.0.3 is the latest version that "works" for me... I've been checking this issue for weeks hoping for the best

@aterrel
Contributor

aterrel commented Nov 30, 2023

@mroeschke

@DanielFPerez

Facing the same issue

lithomas1 added the Regression label and removed the Needs Triage label on Dec 3, 2023
lithomas1 self-assigned this on Dec 3, 2023
lithomas1 added this to the 2.1.4 milestone on Dec 3, 2023
@ferrannoguera

I am still facing the same issue: pickles created with pandas 1.3.5 produce the same error messages when read with pandas 2.1.3.

Specifically, these are pickles that were written with gzip compression.
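
For anyone who wants to look at the raw pickle bytes of a gzip-compressed file outside of pandas, here is a small stdlib-only sketch. It assumes the compression is plain gzip and detects it from the two-byte gzip magic number rather than from the file extension. GZIP_MAGIC and read_pickle_bytes are hypothetical names for illustration, not part of pandas.

```python
import gzip
import pickle

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def read_pickle_bytes(raw: bytes) -> bytes:
    """Return the pickle payload, decompressing first if raw looks like gzip."""
    return gzip.decompress(raw) if raw.startswith(GZIP_MAGIC) else raw


# Round trip on a small, non-pandas object: compressed and plain inputs
# both come back as the same pickle payload.
payload = pickle.dumps({"a": 1}, protocol=4)
compressed = gzip.compress(payload)
print(read_pickle_bytes(compressed) == payload)
print(read_pickle_bytes(payload) == payload)
```

This only recovers the bytes; it does not make an incompatible pickle loadable.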

@mroeschke
Member

A fix for this issue should be available when 2.1.4 is released (either end of the week or next week)

@tgutzler
Copy link

tgutzler commented Jan 9, 2024

Hi, I have a pickle file (uncertain which version it was created with, it's protocol version 4). When I try to open it with pandas 2.1.4, I get this error:
File "[...]/miniconda3/envs/terratest/lib/python3.9/site-packages/pandas/core/internals/blocks.py", line 2400, in new_block
return klass(values, ndim=ndim, placement=placement, refs=refs)
TypeError: Argument 'placement' has incorrect type (expected pandas._libs.internals.BlockPlacement, got slice)
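
As a side note on the protocol version: it can be read from the file without unpickling anything. The sketch below uses only the standard library. pickle_protocol is a hypothetical helper name, and it assumes the bytes are an uncompressed pickle.

```python
import pickle
import pickletools
from typing import Optional


def pickle_protocol(data: bytes) -> Optional[int]:
    """Return the protocol declared by the leading PROTO opcode, or None if absent."""
    # Protocols 2 and later begin with a PROTO opcode; protocols 0 and 1 do not.
    first_opcode, arg, _pos = next(pickletools.genops(data))
    return arg if first_opcode.name == "PROTO" else None


# Demonstration on a small, non-pandas object.
sample = pickle.dumps({"a": 1}, protocol=4)
print(pickle_protocol(sample))
```

For a file on disk, pass the result of open(path, "rb").read(). Note that the protocol number describes the pickle format only; it does not identify the pandas version that wrote the file.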

When I load this with pandas 1.4.2 and dump it again, without changing anything, I can load it fine with pandas 2.1.4.
This is all with python 3.9.18.
How can I get to the bottom of this? I don't want to attach the file here but happy to pm it
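
The load-and-dump workaround described above can be scripted. This is a minimal sketch, assuming it is run in an intermediate environment whose pandas can still read the old file (1.4.2 in the case above) and that the pickled object is a DataFrame or Series. repickle and the file names are hypothetical.

```python
def repickle(src: str, dst: str) -> None:
    """Read a pickle with the installed pandas and write it back out unchanged."""
    # Imported inside the function so that whichever pandas is installed in the
    # environment running this script is the one that does the conversion.
    import pandas as pd

    obj = pd.read_pickle(src)
    # Assumes obj is a DataFrame or Series, both of which provide to_pickle.
    obj.to_pickle(dst)


# Example call (file names are placeholders):
# repickle("old_pandas.pkl", "resaved.pkl")
```

The re-saved file is written by the intermediate pandas version, so it is that version's format the newer pandas then has to read.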

@lithomas1
Member

> Hi, I have a pickle file (uncertain which version it was created with, it's protocol version 4). When I try to open it with pandas 2.1.4, I get this error:
> File "[...]/miniconda3/envs/terratest/lib/python3.9/site-packages/pandas/core/internals/blocks.py", line 2400, in new_block
> return klass(values, ndim=ndim, placement=placement, refs=refs)
> TypeError: Argument 'placement' has incorrect type (expected pandas._libs.internals.BlockPlacement, got slice)
>
> When I load this with pandas 1.4.2 and dump it again, without changing anything, I can load it fine with pandas 2.1.4.
> This is all with python 3.9.18.
> How can I get to the bottom of this? I don't want to attach the file here but happy to pm it

Can you open a new issue?

@tgutzler

Of course: #56825

@liuty1999

I downgraded pandas to 2.0.4, and then it worked without errors.

@sahilm89

I still have this problem in 2.2.2! I had to downgrade to 2.0.3 for pandas to be compatible.
