read_csv with iterator=True and engine = "python" -> error in if self.usecols
#12546
Comments
can you show using supplied data and
The cause of the problem is, as far as I know, that use_cols takes a Python list as a value, and the semantics of
which actually makes this use case plausible and correct according to the docs. I think we should either change the documentation to say list and add a strict type check, or inject something along the lines of use_cols = np.asarray(use_cols, dtype=int).ravel().tolist() into _parser_f of read_csv.
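As a rough sketch of the normalization idea above: the helper below is hypothetical (normalize_usecols is not a pandas function) and only illustrates converting an array-like into a plain list before any truthiness check. Unlike the dtype=int cast suggested above, it leaves string column names untouched, which is an assumption about what the parser should accept.

```python
def normalize_usecols(usecols):
    """Return usecols as a plain list, or None if no selection was given.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Compare against None explicitly: `if usecols:` raises ValueError
    # for a NumPy array with more than one element.
    if usecols is None:
        return None
    # NumPy arrays expose tolist(), which also converts NumPy scalars
    # to native Python objects.
    if hasattr(usecols, "tolist"):
        usecols = usecols.tolist()
    return list(usecols)
```

With a list in hand, a later check such as `if usecols:` behaves as ordinary Python truthiness and no longer depends on the input type.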
Here's how we can reproduce this:
>>> from pandas import read_csv
>>> from pandas.compat import StringIO
>>> import numpy as np
>>> usecols = np.array(['a', 'b'])
>>> data = 'a,b,c\n1,2,3'
>>> read_csv(StringIO(data), usecols=usecols, engine='python')
...
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Note that the C engine is perfectly happy.
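Until the Python engine accepts arrays, one possible caller-side workaround is to pass a plain list instead. This is a minimal sketch that reuses the data from the reproduction; it assumes io.StringIO from the standard library in place of pandas.compat.StringIO.

```python
from io import StringIO

import numpy as np
from pandas import read_csv

usecols = np.array(['a', 'b'])
data = 'a,b,c\n1,2,3'

# Convert the array to a plain list of Python strings before handing it
# to the Python engine, so no array truthiness check is involved.
df = read_csv(StringIO(data), usecols=usecols.tolist(), engine='python')
print(df)
```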
Code Sample, a copy-pastable example if possible
The following example returns an error if run with engine = "python":
Error
output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.3.2.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-573.12.1.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.18.0rc1+80.g820e110.dirty
nose: 1.3.0
pip: 8.0.2
setuptools: 0.9.8
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.12.1
statsmodels: None
xarray: None
IPython: 4.1.0-dev
sphinx: None
patsy: None
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: None
tables: 3.2.2
numexpr: 2.4.7.dev0
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None