REGR: Barplot broken on Index(dtype='object') #38947

Closed
douglas-raillard-arm opened this issue Jan 4, 2021 · 10 comments · Fixed by #46451
Labels
good first issue Needs Tests Unit test(s) needed to prevent regressions Regression Functionality that used to work in a prior pandas version Visualization plotting

Comments

@douglas-raillard-arm

Code Sample, a copy-pastable example

import pandas as pd

# Index mixing an int and a str, so it has dtype='object'
df = pd.DataFrame(
    {
        'a': [1, 2],
    },
    index=pd.Index([0, 'Total'])
)
df.plot.bar()

Problem description

On pandas 1.2.0, this raises:

Traceback (most recent call last):
  File "/home/raillard/WFH/LISA/.lisa-venv-3.8/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 706, in astype
    casted = self._values.astype(dtype, copy=copy)
ValueError: invalid literal for int() with base 10: 'Total'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "testplot.py", line 21, in <module>
    df.plot.bar()
  File "/home/raillard/WFH/LISA/.lisa-venv-3.8/lib/python3.8/site-packages/pandas/plotting/_core.py", line 1113, in bar
    return self(kind="bar", x=x, y=y, **kwargs)
  File "/home/raillard/WFH/LISA/.lisa-venv-3.8/lib/python3.8/site-packages/pandas/plotting/_core.py", line 955, in __call__
    return plot_backend.plot(data, kind=kind, **kwargs)
  File "/home/raillard/WFH/LISA/.lisa-venv-3.8/lib/python3.8/site-packages/pandas/plotting/_matplotlib/__init__.py", line 61, in plot
    plot_obj.generate()
  File "/home/raillard/WFH/LISA/.lisa-venv-3.8/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py", line 280, in generate
    self._make_plot()
  File "/home/raillard/WFH/LISA/.lisa-venv-3.8/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py", line 1434, in _make_plot
    self.tick_pos = ax.convert_xunits(self.ax_index).astype(np.int)
  File "/home/raillard/WFH/LISA/.lisa-venv-3.8/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 708, in astype
    raise TypeError(
TypeError: Cannot cast Index to dtype int64
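
The traceback shows the index labels being cast to integers to obtain tick positions, which cannot work for a label such as 'Total'. As a minimal pure-Python sketch of that failure mode (not pandas' actual code; `cast_labels_to_int` and `positional_ticks` are hypothetical helpers introduced only for illustration), compare casting the labels with deriving tick positions from each label's position:

```python
labels = [0, "Total"]  # same labels as the index in the report


def cast_labels_to_int(labels):
    # Hypothetical helper: mimics casting every label to int.
    return [int(label) for label in labels]


def positional_ticks(labels):
    # Hypothetical helper: one tick per label, placed by position.
    return list(range(len(labels)))


try:
    cast_labels_to_int(labels)
except ValueError as exc:
    # int("Total") fails, just like the first exception in the traceback.
    error_message = str(exc)

tick_pos = positional_ticks(labels)
tick_labels = [str(label) for label in labels]
```

Placing bars by position and using the stringified labels only as tick text avoids any dependence on the labels being numeric.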

Expected Output

On pandas 1.1.5, it works without issues and produces a plot with 2 bars, one labeled "0" and the other one "Total" as expected.

Output of pd.show_versions()

Python 3.8.7 (default, Dec 24 2020, 17:53:09)
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

import pandas as pd
pd.show_versions()

INSTALLED VERSIONS

commit : 3e89b4c
python : 3.8.7.final.0
python-bits : 64
OS : Linux
OS-release : 5.10.3-arch1-1
Version : #1 SMP PREEMPT Sun, 27 Dec 2020 10:50:46 +0000
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : fr_FR.UTF-8
LOCALE : fr_FR.UTF-8

pandas : 1.2.0
numpy : 1.19.4
pytz : 2020.5
dateutil : 2.8.1
pip : 20.3.3
setuptools : 51.1.1
Cython : None
pytest : 6.2.1
hypothesis : None
sphinx : 3.4.2
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.19.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.3
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 2.0.0
pyxlsb : None
s3fs : None
scipy : 1.6.0
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None

@douglas-raillard-arm douglas-raillard-arm added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jan 4, 2021
@douglas-raillard-arm douglas-raillard-arm changed the title BUG: Barplot broken on Index(dtype='object') REGR: Barplot broken on Index(dtype='object') Jan 4, 2021
@mzeitlin11
Member

Thanks for the report @douglas-raillard-arm!

First bad commit fb379d8, (#28733)

xref #38736, #38865

@mzeitlin11 mzeitlin11 added Regression Functionality that used to work in a prior pandas version Visualization plotting and removed Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jan 4, 2021
@jorisvandenbossche jorisvandenbossche added this to the 1.2.1 milestone Jan 4, 2021
@arc12

arc12 commented Jan 14, 2021

I suppose this bug also causes the following to fail with "TypeError: 'value' must be an instance of str or bytes, not a float":

pd.Series(data=range(6), index=["A"] * 4 + [np.nan] * 2).plot(kind='bar')

This is a fairly common kind of index, arising from value_counts(dropna=False) in numerous places (for me).
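
As a hedged sketch of how such an index arises, the snippet below builds a small object-dtype Series (made-up data) and counts values with `dropna=False`. Converting the labels to strings before plotting is one possible workaround on an affected version; it is an assumption for illustration, not something proposed in this thread.

```python
import numpy as np
import pandas as pd

# Made-up data: four "A" values and two missing values.
values = pd.Series(["A"] * 4 + [np.nan] * 2, dtype=object)

# dropna=False keeps NaN as an index label in the result.
counts = values.value_counts(dropna=False)
has_nan_label = bool(counts.index.isna().any())

# Possible workaround (assumption): stringify the labels before plotting.
relabeled = counts.copy()
relabeled.index = [str(label) for label in counts.index]
```

The `relabeled` Series has only string labels and can then be passed to `.plot(kind='bar')`.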

@simonjayhawkins simonjayhawkins added the Needs Tests Unit test(s) needed to prevent regressions label Jan 18, 2021
@simonjayhawkins
Member

First bad commit fb379d8, (#28733)

#28733 has been reverted. This needs tests to prevent regressions.

@simonjayhawkins simonjayhawkins modified the milestones: 1.2.1, Contributions Welcome Jan 18, 2021
@cckang

cckang commented Aug 15, 2021

I tested the code below using pandas version 1.4.0.dev0+424.gab5322ccd6:

df = pd.DataFrame(
    {
        'a' : [1, 2],
    },
    index=pd.Index([0, 'Total'])
)

df.plot.bar()

I did not encounter any errors when running the above code. However, it does not produce any plot.

@sriakshata

With Pandas 1.3.1, it works fine. I think the problem is only with the 1.2.0 version of pandas. Apart from that is there any other problem? I might not have understood correctly, so please tell me if I am wrong. I am new to open source projects.
Screenshot (316)

@cckang

cckang commented Aug 21, 2021

I am new to open source as well. Learning how to contribute :)

@sriakshata

@douglas-raillard-arm, I am actually new to open source. My question is: it works fine with pandas v1.3.2, so I guess the problem is only with pandas v1.2.0. Since it works fine with the latest release, do we have to fix it in pandas v1.2.0? Please guide me.

@simonjayhawkins
Member

No fixes are required. The issue is to add a test to the pandas test suite to ensure we don't get a similar regression in the future.
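
A regression test along these lines might look like the sketch below. It is only an illustration: the test name is hypothetical, it is not the test that was eventually merged, and it assumes matplotlib is installed and that the non-interactive "Agg" backend is acceptable.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the test needs no display

import pandas as pd


def test_bar_plot_with_mixed_object_index():
    # Hypothetical test name. Same frame as in the original report:
    # an object-dtype index mixing an int and a str.
    df = pd.DataFrame({"a": [1, 2]}, index=pd.Index([0, "Total"]))

    # On pandas 1.2.0 this call raised TypeError.
    ax = df.plot.bar()

    # Expect one bar per row, labelled with the stringified index values.
    labels = [tick.get_text() for tick in ax.get_xticklabels()]
    assert labels == ["0", "Total"]
    assert len(ax.patches) == 2
```

Running it under pytest on an affected version should fail at the `df.plot.bar()` call, which is what makes it useful as a guard.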

@SabrinaMB

Please assign me to this issue.

@Daquisu
Contributor

Daquisu commented Mar 20, 2022

take

@mroeschke mroeschke modified the milestones: Contributions Welcome, 1.5 Apr 24, 2022