Series.replace() should raise an exception if invalid argument is given #18634

Closed
upkarlidder opened this issue Dec 4, 2017 · 4 comments · Fixed by #31946
Labels
API Design, Strings (String extension data type and string data)

@upkarlidder

Should Series.replace() raise an exception when it is given an invalid argument? I am new to pandas, so this is more of a question. The mistake below is a user error on my part, but it points to a possible improvement to replace.

import pandas as pd

df = pd.DataFrame(
{
    'one': ['1', '1 ', '10'],
    'two': ['1 ', '20 ', '30 ']
})

# creates
   one  two
0    1    1
1    1   20
2   10   30

# I intentionally added a trailing space to one value in df.one, so df.one.value_counts() results in
1     1
10    1
1     1

# I want to strip the spaces around all values in df.one 
# so that value_counts() only has 1 and 10 with values 2 and 1 respectively. 
# The following does not work.

df.one.replace(lambda x: x.strip(), inplace=True)
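For the goal described above (trimming whitespace so that '1' and '1 ' count as the same value), here is a minimal sketch of two approaches that should work. It assumes the same frame as in this report. The `.str.strip()` accessor is the usual tool for this, and `replace` with `regex=True` is shown only as an alternative. The variable names `stripped_via_regex` and `counts` are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'one': ['1', '1 ', '10'],
    'two': ['1 ', '20 ', '30 '],
})

# Alternative: replace() with a regex that matches leading/trailing whitespace
stripped_via_regex = df['one'].replace(r'^\s+|\s+$', '', regex=True)

# Usual approach: vectorized strip through the .str accessor, assigned back
df['one'] = df['one'].str.strip()

counts = df['one'].value_counts()
print(counts)
```

After this, value_counts() on df.one should report only the two distinct values, '1' and '10'.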

Problem description

Series.replace() accepts "str, regex, list, dict, Series, numeric, or None", so I understand why the above does not work. But as @toobaz pointed out on Gitter, replace should probably raise an error when a lambda is passed?

Expected Output

Raises TypeError
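To make the expected behavior concrete, here is a rough sketch of the kind of up-front check being asked for. This is not pandas code: `check_to_replace` is a hypothetical helper written only for illustration, and the error message wording is an assumption. It only shows the idea of rejecting a callable instead of silently accepting it.

```python
def check_to_replace(to_replace):
    """Hypothetical helper (not part of pandas): reject callables up front."""
    if callable(to_replace):
        raise TypeError(
            "to_replace must be a str, regex, list, dict, Series, numeric, "
            f"or None, got {type(to_replace).__name__!r}"
        )
    return to_replace


# A lambda is callable, so the check raises instead of passing it through
try:
    check_to_replace(lambda x: x.strip())
except TypeError as err:
    message = str(err)
    print(message)

# Supported argument types pass through unchanged
mapping = check_to_replace({'1 ': '1'})
```

With a check like this, the call in the example above would fail loudly, which would have pointed me to the mistake right away.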

Output of pd.show_versions()


INSTALLED VERSIONS

commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.3
pytest: None
pip: 9.0.1
setuptools: 36.3.0
Cython: None
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: 1.6.3
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.8
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: 3.8.0
bs4: None
html5lib: 0.999999999
sqlalchemy: 1.1.14
pymysql: 0.7.11.None
psycopg2: None
jinja2: 2.8.1
s3fs: None
pandas_gbq: None
pandas_datareader: None

@gfyoung added the API Design and Strings labels on Dec 5, 2017
@gfyoung
Member

gfyoung commented Dec 5, 2017

> replace should probably raise an error?

@lidderupk: That makes sense to me. A PR to implement this is more than welcome!

@upkarlidder
Author

@gfyoung: I have forked the codebase and am reading "Contributing to Pandas" now. Is this a beginner-friendly PR? I would like to contribute if it is. Thank you.

@toobaz
Member

toobaz commented Dec 5, 2017

This one is not hard, but if you want an even more beginner-friendly one, take a look at #18589, for instance. Or you can get the list of easy issues and pick the one you prefer.

@gfyoung
Member

gfyoung commented Dec 5, 2017

@lidderupk: This is very much doable as a beginner. Feel free to ask questions if you're unclear on anything. If you get stuck, submit what you have as a PR, and we can help you out.
