AttributeError on datetime.timedelta assignment after reindex #8209

Closed
flaviovs opened this issue Sep 8, 2014 · 11 comments · Fixed by #8184
Labels
Bug Timedelta Timedelta data type

@flaviovs

flaviovs commented Sep 8, 2014

In the code below, I expected the assignment to succeed, since everything seems "right". However, an AttributeError is raised.

A bug? Am I doing something wrong?

In [106]: import pandas as pd

In [107]: import datetime as dt

In [108]: s = pd.Series([])

In [109]: s
Out[109]: Series([], dtype: float64)

In [110]: s.loc['B'] = dt.timedelta(1)

In [111]: s
Out[111]: 
B   1 days
dtype: timedelta64[ns]

In [112]: s = s.reindex(s.index.insert(0, 'A'))

In [113]: s
Out[113]: 
A      NaT
B   1 days
dtype: timedelta64[ns]

In [114]: s.loc['A'] = dt.timedelta(1)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-114-f662acce72ae> in <module>()
----> 1 s.loc['A'] = dt.timedelta(1)
/home/flaviovs/.virtualenv/hm/local/lib/python2.7/site-packages/pandas-0.14.1_354_g12248ff-py2.7-linux-x86_64.egg/pandas/core/indexing.pyc in __setitem__(self, key, value)

/home/flaviovs/.virtualenv/hm/local/lib/python2.7/site-packages/pandas-0.14.1_354_g12248ff-py2.7-linux-x86_64.egg/pandas/core/indexing.pyc in _setitem_with_indexer(self, indexer, value)

/home/flaviovs/.virtualenv/hm/local/lib/python2.7/site-packages/pandas-0.14.1_354_g12248ff-py2.7-linux-x86_64.egg/pandas/core/internals.pyc in setitem(self, **kwargs)

/home/flaviovs/.virtualenv/hm/local/lib/python2.7/site-packages/pandas-0.14.1_354_g12248ff-py2.7-linux-x86_64.egg/pandas/core/internals.pyc in apply(self, f, axes, filter, do_integrity_check, **kwargs)

/home/flaviovs/.virtualenv/hm/local/lib/python2.7/site-packages/pandas-0.14.1_354_g12248ff-py2.7-linux-x86_64.egg/pandas/core/internals.pyc in setitem(self, indexer, value)

/home/flaviovs/.virtualenv/hm/local/lib/python2.7/site-packages/pandas-0.14.1_354_g12248ff-py2.7-linux-x86_64.egg/pandas/core/internals.pyc in _try_coerce_args(self, values, other)

/home/flaviovs/.virtualenv/hm/local/lib/python2.7/site-packages/pandas-0.14.1_354_g12248ff-py2.7-linux-x86_64.egg/pandas/core/internals.pyc in masker(v)

AttributeError: 'datetime.timedelta' object has no attribute 'view'

Tried with 0.14.1 and the latest on GH (v0.15pre), to no avail.

I'm sorry I'm not able to provide a patch right now (still wrapping my head around pandas internals; any directions on how to fix this issue would be appreciated, though).

Versions:

In [125]: pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.3.final.0
python-bits: 64
OS: Linux
OS-release: 3.2.60
machine: x86_64
processor: 
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8

pandas: 0.14.1-354-g12248ff
nose: None
Cython: 0.20.1
numpy: 1.8.1
scipy: None
statsmodels: None
IPython: 2.1.0
sphinx: None
patsy: None
scikits.timeseries: None
dateutil: 2.2
pytz: 2014.3
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.1.1rc2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: None
pymysql: None
psycopg2: None
@jreback jreback added Bug Timedelta Timedelta data type labels Sep 8, 2014
@jreback jreback added this to the 0.15.0 milestone Sep 8, 2014
@jreback
Contributor

jreback commented Sep 8, 2014

Fixed in the general refactoring of timedeltas (#8184); here's the fix: jreback@ea09610

in the interim you can simply do:

s = s.fillna(timedelta(1)) or s.loc['A'] = np.timedelta64(1,'D')

The issue was that it's not coercing the timedeltas correctly.
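
For reference, the two interim workarounds above can be sketched as a standalone script. This is only a sketch: it builds the series directly rather than through `reindex`, and the names `filled` and `assigned` are made up for illustration.

```python
import datetime as dt

import numpy as np
import pandas as pd

# Start from the state in the report: 'A' is NaT, 'B' holds one day.
s = pd.Series(
    [pd.NaT, dt.timedelta(days=1)],
    index=["A", "B"],
    dtype="timedelta64[ns]",
)

# Workaround 1: fill the missing entry instead of assigning through .loc.
filled = s.fillna(dt.timedelta(days=1))

# Workaround 2: assign a NumPy timedelta64 rather than a datetime.timedelta.
assigned = s.copy()
assigned.loc["A"] = np.timedelta64(1, "D")

print(filled)
print(assigned)
```

Note that `fillna` replaces every missing entry, not just 'A', so it is only equivalent to the assignment when 'A' is the single NaT in the series.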

@jtratner
Contributor

jtratner commented Sep 8, 2014

@jreback I was just looking at this too - is TimeDeltaBlock missing a _can_hold_element method, or is there some reason why that never gets called?

@jtratner
Contributor

jtratner commented Sep 8, 2014

ie., something like this diff:

diff --git a/pandas/core/internals.py b/pandas/core/internals.py
index 6672546..cfbdcd7 100644
--- a/pandas/core/internals.py
+++ b/pandas/core/internals.py
@@ -1224,6 +1224,16 @@ class TimeDeltaBlock(IntBlock):
     _can_hold_na = True
     is_numeric = False

+    def _can_hold_element(self, value):
+        if is_list_like(value):
+            value = np.array(value)
+            typ = value.dtype.type
+            return (issubclass(typ, (np.integer, np.timedelta64, np.float_)) and not
+                    issubclass(typ, np.datetime64))
+        return (isinstance(value, (np.timedelta64, timedelta, np.float_,
+                                  float)) or
+                           com.is_integer(value) or _is_null_datelike_scalar(value))
+
     @property
     def fill_value(self):
         return tslib.iNaT

@jreback
Contributor

jreback commented Sep 8, 2014

No, it can ONLY hold integers! (And it's a subclass of IntBlock.) Things have to be coerced beforehand. (I guess technically it might need this, but only putmask calls it.) If you can make a test case, I guess I can add it.

Regardless, this bug is already fixed in #8184.

@jtratner
Contributor

jtratner commented Sep 8, 2014

@jreback no no - I was just wrong about the level of the bug. Trying to dip my foot back into internals again (just a little bit)

@jreback
Contributor

jreback commented Sep 8, 2014

@jtratner hah, no prob. As I said, this _can_hold_element biz is not used much, as almost all coercion happens at a higher level (and mostly by the _try_.... methods).

@flaviovs
Author

flaviovs commented Sep 8, 2014

Thanks for the reply, +1 for the refactor PR!

BTW, a side question: is there a way to force pandas to store datetime.timedelta as object dtypes? The intuitive solution does not work:

In [57]: s = pd.Series([dt.timedelta(days=1)], dtype=object)

In [58]: s
Out[58]: 
0   1 days
dtype: timedelta64[ns]

Neither via numpy arrays:

In [91]: tdarr = np.array([dt.timedelta(days=1)])

In [92]: tdarr
Out[92]: array([datetime.timedelta(1)], dtype=object)

In [93]: s = pd.Series(tdarr)

In [94]: s
Out[94]: 
0   1 days
dtype: timedelta64[ns]

I understand that this will be a non-issue when the PR is merged, but in the meantime I have to fix my app, which expects datetime.timedeltas. I wouldn't mind switching to np.timedelta64, but it's undeniable that working with np.timedelta64 is far from convenient (e.g. no "days" property, no "total_seconds()", etc.). And (AFAIK) there is no easy, standard way to bring a np.timedelta64 back to datetime.timedelta, which only adds to the PITA...

Thanks again.

@jtratner
Contributor

jtratner commented Sep 8, 2014

@flaviovs that used to work :)

There's actually a really easy way to convert timedelta64 to python timedelta:

In [25]: import numpy as np

In [26]: import datetime as dt

In [27]: arr = np.array([np.timedelta64(dt.timedelta(232)), np.timedelta64(dt.timedelta(13))])

In [28]: arr.astype(object)
Out[28]: array([datetime.timedelta(232), datetime.timedelta(13)], dtype=object)

(and that works on individual timedelta objects too)
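
As a script, the conversion above might look like the sketch below. One caveat that the transcript does not show: for nanosecond-resolution data (which is what pandas stores), NumPy returns plain integers from `astype(object)` rather than `datetime.timedelta`, so the sketch casts to microseconds first. The variable names are illustrative only.

```python
import datetime as dt

import numpy as np

# Day-resolution timedelta64 values convert straight to datetime.timedelta.
arr = np.array([np.timedelta64(232, "D"), np.timedelta64(13, "D")])
py_deltas = arr.astype(object)

# Nanosecond-resolution arrays convert to plain ints instead, because
# datetime.timedelta cannot represent nanoseconds. Casting to microseconds
# first (dropping any sub-microsecond part) gives timedelta objects again.
ns_arr = arr.astype("timedelta64[ns]")
py_from_ns = ns_arr.astype("timedelta64[us]").astype(object)

# A single value can be converted with .item().
single = np.timedelta64(13, "D").item()

print(py_deltas, py_from_ns, single)
```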

@jreback
Contributor

jreback commented Sep 8, 2014

@flaviovs you can use @jtratner's solution in the interim, or if you are up for trying out the new PR (will be merged soon), then this will work (which is very similar to how Timestamp/datetimes work):

In [8]: s = pd.Series([dt.timedelta(days=1)])

In [9]: s
Out[9]: 
0   1 days
dtype: timedelta64[ns]

In [10]: s.iloc[0]
Out[10]: Timedelta('1 days, 00:00:00')

In [11]: s.astype(object)
Out[11]: 
0    1 day, 0:00:00
dtype: object

In [12]: s.astype(object).iloc[0]
Out[12]: datetime.timedelta(1)

In [13]: s.values
Out[13]: array([86400000000000], dtype='timedelta64[ns]')

In [15]: pd.TimedeltaIndex(s.values)
Out[15]: 
<class 'pandas.tseries.tdi.TimedeltaIndex'>
[1 days]
Length: 1, Freq: None

In [17]: pd.TimedeltaIndex(s.values).to_pytimedelta()
Out[17]: array([datetime.timedelta(1)], dtype=object)
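
Condensed into a script, the last step of the session above could be sketched as follows. It assumes a pandas version that includes `TimedeltaIndex` (i.e. the refactor from the PR or later); the sample values and the name `py_deltas` are illustrative only.

```python
import datetime as dt

import pandas as pd

# pandas infers a timedelta64 dtype for a list of datetime.timedelta objects.
s = pd.Series([dt.timedelta(days=1), dt.timedelta(hours=6)])

# Round-trip back to standard-library timedelta objects.
py_deltas = pd.TimedeltaIndex(s.values).to_pytimedelta()

# The results expose the usual datetime.timedelta API.
print(py_deltas[0].days)
print(py_deltas[1].total_seconds())
```

The result is a NumPy object array, so an app that expects `datetime.timedelta` can use `.days` and `.total_seconds()` on its elements as before.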

@flaviovs
Author

flaviovs commented Sep 9, 2014

@jtratner, your solution uses numpy arrays. My conceptual question is how to have a pandas.Series containing datetime.timedelta's stored as object dtypes...
But never mind, I found a workaround until this issue is settled.
Thank you all!

p.s.: I'm relatively new to GitHub and not used to issues workflow, thus I'm not closing this issue. Feel free to close it, as I'm finished. Let me know if you prefer I do the close. Thx

@jreback
Contributor

jreback commented Sep 9, 2014

@flaviovs this will be closed when I merge in the PR. Thanks.
