unexpected behavior when assigning data to a multicolumn dataframe #5508

Closed
twmr opened this issue Nov 13, 2013 · 4 comments · Fixed by #5512
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves

twmr commented Nov 13, 2013

Using the latest git version of pandas (705b677), I have experienced the
following problems:

In [1]: a = pd.DataFrame(index=pd.Index(xrange(1,11)))

In [2]: a['foo'] = np.zeros(10, dtype=np.float)

In [3]: a['bar'] = np.zeros(10, dtype=np.complex)

In [4]: a.ix[2:5, 'bar']
Out[4]:
2    0j
3    0j
4    0j
5    0j
Name: bar, dtype: complex128

In [5]: a.ix[2:5, 'bar'] = np.array([2.33j, 1.23+0.1j, 2.2]) 
# invalid input (RHS has wrong size) -> does not throw an exception!
# (No exception is thrown because of the different dtype of a['foo'];
# compare with ``In [9]``-``In [11]``, where only 'bar' exists.)

In [6]: a
Out[6]:
    foo    bar
1     0     0j
2     0  2.33j
3     0  2.33j
4     0  2.33j
5     0  2.33j
6     0     0j
7     0     0j
8     0     0j
9     0     0j
10    0     0j

In [7]: a.ix[2:5, 'bar'] = np.array([2.33j, 1.23+0.1j, 2.2, 1.0]) # valid

In [8]: a
Out[8]:
    foo          bar
1     0           0j
2     0        2.33j
3     0  (1.23+0.1j)
4     0     (2.2+0j)
5     0       (1+0j)
6     0           0j
7     0           0j
8     0           0j
9     0           0j
10    0           0j


In [9]: a = pd.DataFrame(index=pd.Index(xrange(1,11)))

In [10]: a['bar'] = np.zeros(10, dtype=np.complex)

In [11]: a.ix[2:5, 'bar'] = np.array([2.33j, 1.23+0.1j, 2.2]) 
# invalid RHS -> exception raised, OK!
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-11-bde45910cde6> in <module>()
----> 1 a.ix[2:5, 'bar'] = np.array([2.33j, 1.23+0.1j, 2.2])

/home/thomas/.local/lib/python2.7/site-packages/pandas-0.12.0_1098_g705b677-py2.7-linux-x86_64.egg/pandas/core/indexing.pyc in __setitem__(self, key, value)
     92             indexer = self._convert_to_indexer(key, is_setter=True)
     93
---> 94         self._setitem_with_indexer(indexer, value)
     95
     96     def _has_valid_type(self, k, axis):

/home/thomas/.local/lib/python2.7/site-packages/pandas-0.12.0_1098_g705b677-py2.7-linux-x86_64.egg/pandas/core/indexing.pyc in _setitem_with_indexer(self, indexer, value)
    387                 value = self._align_panel(indexer, value)
    388
--> 389             self.obj._data = self.obj._data.setitem(indexer,value)
    390             self.obj._maybe_update_cacher(clear=True)
    391

/home/thomas/.local/lib/python2.7/site-packages/pandas-0.12.0_1098_g705b677-py2.7-linux-x86_64.egg/pandas/core/internals.pyc in setitem(self, *args, **kwargs)
   2182
   2183     def setitem(self, *args, **kwargs):
-> 2184         return self.apply('setitem', *args, **kwargs)
   2185
   2186     def putmask(self, *args, **kwargs):

/home/thomas/.local/lib/python2.7/site-packages/pandas-0.12.0_1098_g705b677-py2.7-linux-x86_64.egg/pandas/core/internals.pyc in apply(self, f, *args, **kwargs)
   2162
   2163             else:
-> 2164                 applied = getattr(blk, f)(*args, **kwargs)
   2165
   2166             if isinstance(applied, list):

/home/thomas/.local/lib/python2.7/site-packages/pandas-0.12.0_1098_g705b677-py2.7-linux-x86_64.egg/pandas/core/internals.pyc in setitem(self, indexer, value)
    580         try:
    581             # set and return a block
--> 582             values[indexer] = value
    583
    584             # coerce and try to infer the dtypes of the result

ValueError: could not broadcast input array from shape (3) into shape (4)
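
For readers without a pandas build of that era, the last frame of the traceback (`values[indexer] = value`) can be reproduced with NumPy alone. This is a minimal sketch, not the pandas code path itself: `values` is a hypothetical stand-in for the complex block behind `a['bar']`, and the positional slice `1:5` is assumed to correspond to the labels 2 to 5 of the index above.

```python
import numpy as np

# Stand-in for the complex block behind a['bar'] (ten zeros).
values = np.zeros(10, dtype=complex)
rhs = np.array([2.33j, 1.23 + 0.1j, 2.2])  # three values

try:
    # Positions 1..4 correspond to labels 2..5: four slots, three values.
    values[1:5] = rhs
    error = None
except ValueError as exc:
    error = exc

print(type(error).__name__, error)
```

NumPy rejects the assignment before writing anything, so `values` stays all zeros. The exact wording of the message can differ slightly between NumPy versions.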

Here is the 2nd problem:

In [1]: b = pd.DataFrame(index=pd.Index(xrange(1,11)))

In [2]: b['foo'] = np.zeros(10, dtype=np.float)

In [3]: b['bar'] = np.zeros(10, dtype=np.complex)

In [4]: b
Out[4]:
    foo  bar
1     0   0j
2     0   0j
3     0   0j
4     0   0j
5     0   0j
6     0   0j
7     0   0j
8     0   0j
9     0   0j
10    0   0j

In [5]: b[2:5]
Out[5]:
   foo  bar
3    0   0j
4    0   0j
5    0   0j

In [6]: b[2:5] = np.arange(1,4)*1j 
# invalid input (wrong size on RHS)

In [7]: b
Out[7]:
    foo  bar
1    0j   0j
2    0j   0j
3    1j   2j
4    1j   2j
5    1j   2j
6    0j   0j
7    0j   0j
8    0j   0j
9    0j   0j
10   0j   0j

# Why does the expression in ``In [6]`` change the dtype of b['foo']?
# Is this intended?

In [8]: b[2:5] = np.arange(1,4)*1j # invalid input (wrong size on RHS)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-a729140e3f7f> in <module>()
----> 1 b[2:5] = np.arange(1,4)*1j # invalid input (wrong size on RHS)

/home/thomas/.local/lib/python2.7/site-packages/pandas-0.12.0_1098_g705b677-py2.7-linux-x86_64.egg/pandas/core/frame.pyc in __setitem__(self, key, value)
   1831         indexer = _convert_to_index_sliceable(self, key)
   1832         if indexer is not None:
-> 1833             return self._setitem_slice(indexer, value)
   1834
   1835         if isinstance(key, (Series, np.ndarray, list)):

/home/thomas/.local/lib/python2.7/site-packages/pandas-0.12.0_1098_g705b677-py2.7-linux-x86_64.egg/pandas/core/frame.pyc in _setitem_slice(self, key, value)
   1842
   1843     def _setitem_slice(self, key, value):
-> 1844         self.ix._setitem_with_indexer(key, value)
   1845
   1846     def _setitem_array(self, key, value):

/home/thomas/.local/lib/python2.7/site-packages/pandas-0.12.0_1098_g705b677-py2.7-linux-x86_64.egg/pandas/core/indexing.pyc in _setitem_with_indexer(self, indexer, value)
    387                 value = self._align_panel(indexer, value)
    388
--> 389             self.obj._data = self.obj._data.setitem(indexer,value)
    390             self.obj._maybe_update_cacher(clear=True)
    391

/home/thomas/.local/lib/python2.7/site-packages/pandas-0.12.0_1098_g705b677-py2.7-linux-x86_64.egg/pandas/core/internals.pyc in setitem(self, *args, **kwargs)
   2182
   2183     def setitem(self, *args, **kwargs):
-> 2184         return self.apply('setitem', *args, **kwargs)
   2185
   2186     def putmask(self, *args, **kwargs):

/home/thomas/.local/lib/python2.7/site-packages/pandas-0.12.0_1098_g705b677-py2.7-linux-x86_64.egg/pandas/core/internals.pyc in apply(self, f, *args, **kwargs)
   2162
   2163             else:
-> 2164                 applied = getattr(blk, f)(*args, **kwargs)
   2165
   2166             if isinstance(applied, list):

/home/thomas/.local/lib/python2.7/site-packages/pandas-0.12.0_1098_g705b677-py2.7-linux-x86_64.egg/pandas/core/internals.pyc in setitem(self, indexer, value)
    580         try:
    581             # set and return a block
--> 582             values[indexer] = value
    583
    584             # coerce and try to infer the dtypes of the result

ValueError: could not broadcast input array from shape (3) into shape (3,2)
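
The shape mismatch in this last message can also be sketched with NumPy alone. The array `target` below is a hypothetical row-major stand-in for the three selected rows and the two columns; it is not how pandas stores the data internally. The final line only shows NumPy's promotion rule for float and complex; whether pandas is meant to upcast `b['foo']` in this situation is not answered in this thread.

```python
import numpy as np

# Stand-in for the selection b[2:5]: three rows, two columns (foo, bar).
target = np.zeros((3, 2), dtype=complex)
rhs = np.arange(1, 4) * 1j  # shape (3,)

try:
    # The trailing dimensions (2 and 3) do not match, so broadcasting fails.
    target[:] = rhs
    error = None
except ValueError as exc:
    error = exc

# A float64 column cannot hold complex values without being upcast.
promoted = np.result_type(np.float64, np.complex128)
print(error, promoted)
```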

jreback commented Nov 14, 2013

Thanks for the report. PR #5512 was merged; go ahead and give it a try, and let me know about any more issues!

twmr commented Nov 14, 2013

You are great! This PR fixed the problems

jreback commented Nov 14, 2013

gr8! keep the reports coming....

@tamlt2704

I have this issue with a newer version of pandas:

import pandas as pd
import numpy as np
print pd.__version__
a = pd.DataFrame(index=pd.Index(xrange(1,11)))
a['bar'] = np.zeros(10, dtype=np.complex)
a.ix[2:5, 'bar'] = np.array([2.33j, 1.23+0.1j, 2.2]) 

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
in <module>()
      4 a = pd.DataFrame(index=pd.Index(xrange(1,11)))
      5 a['bar'] = np.zeros(10, dtype=np.complex)
----> 6 a.ix[2:5, 'bar'] = np.array([2.33j, 1.23+0.1j, 2.2])

/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.pyc in __setitem__(self, key, value)
    192 key = com._apply_if_callable(key, self.obj)
    193 indexer = self._get_setitem_indexer(key)
--> 194 self._setitem_with_indexer(indexer, value)
    195
    196 def _has_valid_type(self, k, axis):

/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.pyc in _setitem_with_indexer(self, indexer, value)
    637 self.obj._consolidate_inplace()
    638 self.obj._data = self.obj._data.setitem(indexer=indexer,
--> 639 value=value)
    640 self.obj._maybe_update_cacher(clear=True)
    641

/usr/local/lib/python2.7/dist-packages/pandas/core/internals.pyc in setitem(self, **kwargs)
   3439
   3440 def setitem(self, **kwargs):
-> 3441 return self.apply('setitem', **kwargs)
   3442
   3443 def putmask(self, **kwargs):

/usr/local/lib/python2.7/dist-packages/pandas/core/internals.pyc in apply(self, f, axes, filter, do_integrity_check, consolidate, **kwargs)
   3327
   3328 kwargs['mgr'] = self
-> 3329 applied = getattr(b, f)(**kwargs)
   3330 result_blocks = _extend_blocks(applied, result_blocks)
   3331

/usr/local/lib/python2.7/dist-packages/pandas/core/internals.pyc in setitem(self, indexer, value, mgr)
    899 # set
    900 else:
--> 901 values[indexer] = value
    902
    903 # coerce and try to infer the dtypes of the result

ValueError: could not broadcast input array from shape (3) into shape (4)
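
The snippet above passes three values for the four labels 2 to 5, which is the case marked "invalid RHS" in the original report. A minimal sketch of the matching-length assignment on a current Python 3 install follows. It assumes `.loc` as the replacement for `.ix`, `range` for `xrange`, and the builtin `complex` for `np.complex`, none of which appear in the original thread; the `rows` variable and the length check are illustrative only.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(index=pd.Index(range(1, 11)))
a["bar"] = np.zeros(10, dtype=complex)

# Label-based slices include both end points, so 2:5 selects four rows.
rows = a.loc[2:5].index
rhs = np.array([2.33j, 1.23 + 0.1j, 2.2, 1.0])

# Check the sizes up front instead of relying on the assignment to fail.
if len(rhs) != len(rows):
    raise ValueError("expected %d values, got %d" % (len(rows), len(rhs)))

a.loc[2:5, "bar"] = rhs
print(a.loc[2:5, "bar"])
```

With four values the assignment goes through and the rows outside the slice keep their initial `0j`.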
