Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Erroneous AssertionError: Index length did not match values on plot #4486

Closed
fonnesbeck opened this issue Aug 6, 2013 · 16 comments · Fixed by #4492
Closed

Erroneous AssertionError: Index length did not match values on plot #4486

fonnesbeck opened this issue Aug 6, 2013 · 16 comments · Fixed by #4492
Labels
Milestone

Comments

@fonnesbeck
Copy link

I have what appears to be a valid Series with a time series index as follows:

In [41]: inactive
Out[41]: 
2010-06-10 20:43:07    37.2
2010-06-21 06:37:28     0.0
2009-07-20 06:53:38     0.0
2012-05-17 01:50:13    27.4
2009-07-27 21:09:15     0.0
2010-05-04 09:06:54     0.0
2010-05-06 03:38:54    32.5
2010-05-09 08:33:16    56.9
2010-05-26 18:58:42     0.0
2010-05-31 18:40:41     0.0
2010-06-13 08:46:17     0.0
2010-06-16 18:27:51     0.0
2010-06-19 11:59:48     0.0
2010-06-24 15:24:26    33.4
2010-06-28 10:54:27     0.0
...
2008-11-02 07:35:22     71.2
2008-11-05 17:33:39     97.6
2009-10-22 23:15:39     75.7
2009-10-27 20:55:07     67.5
2009-06-02 13:57:15     35.8
2009-06-06 14:15:41     25.9
2008-11-07 01:22:46      0.0
2009-10-12 17:08:20    100.0
2010-09-19 10:21:39      0.0
2010-09-24 15:46:50      0.0
2010-09-30 16:48:09      0.0
2010-10-08 21:43:10      0.0
2010-10-15 19:27:11      0.0
2010-10-28 19:10:42      0.0
2010-06-13 10:32:45    100.0
Name: pdgt10, Length: 13536, dtype: float64

Note also:

In [43]: inactive.isnull().sum()
Out[43]: 0

In [45]: inactive.index.size
Out[45]: 13536

In [46]: inactive.values.size
Out[46]: 13536

However, when I try calling inactive.plot() in order to get a time series plot, I receive a puzzling AssertionError:

In [40]: inactive.plot()
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-40-cdcfa4d871e5> in <module>()
----> 1 inactive.plot()

/Library/Python/2.7/site-packages/pandas-0.12.0_80_gfcaf9a6_20130729-py2.7-macosx-10.8-intel.egg/pandas/tools/plotting.pyc in plot_series(series, label, kind, use_index, rot, xticks, yticks, xlim, ylim, ax, style, grid, legend, logx, logy, secondary_y, **kwds)
   1730                      secondary_y=secondary_y, **kwds)
   1731 
-> 1732     plot_obj.generate()
   1733     plot_obj.draw()
   1734 

/Library/Python/2.7/site-packages/pandas-0.12.0_80_gfcaf9a6_20130729-py2.7-macosx-10.8-intel.egg/pandas/tools/plotting.pyc in generate(self)
    856         self._compute_plot_data()
    857         self._setup_subplots()
--> 858         self._make_plot()
    859         self._post_plot_logic()
    860         self._adorn_subplots()

/Library/Python/2.7/site-packages/pandas-0.12.0_80_gfcaf9a6_20130729-py2.7-macosx-10.8-intel.egg/pandas/tools/plotting.pyc in _make_plot(self)
   1244             lines = []
   1245             labels = []
-> 1246             x = self._get_xticks(convert_period=True)
   1247 
   1248             plotf = self._get_plot_function()

/Library/Python/2.7/site-packages/pandas-0.12.0_80_gfcaf9a6_20130729-py2.7-macosx-10.8-intel.egg/pandas/tools/plotting.pyc in _get_xticks(self, convert_period)
   1034                 x = index._mpl_repr()
   1035             elif is_datetype:
-> 1036                 self.data = self.data.reindex(index=index.order())
   1037                 x = self.data.index._mpl_repr()
   1038             else:

/Library/Python/2.7/site-packages/pandas-0.12.0_80_gfcaf9a6_20130729-py2.7-macosx-10.8-intel.egg/pandas/core/series.pyc in reindex(self, index, method, level, fill_value, limit, copy, takeable)
   2649 
   2650         # GH4246 (dispatch to a common method with frame to handle possibly duplicate index)
-> 2651         return self._reindex_with_indexers(new_index, indexer, copy=copy, fill_value=fill_value)
   2652 
   2653     def _reindex_with_indexers(self, index, indexer, copy, fill_value):

/Library/Python/2.7/site-packages/pandas-0.12.0_80_gfcaf9a6_20130729-py2.7-macosx-10.8-intel.egg/pandas/core/series.pyc in _reindex_with_indexers(self, index, indexer, copy, fill_value)
   2653     def _reindex_with_indexers(self, index, indexer, copy, fill_value):
   2654         new_values = com.take_1d(self.values, indexer, fill_value=fill_value)
-> 2655         return Series(new_values, index=index, name=self.name)
   2656 
   2657     def reindex_axis(self, labels, axis=0, **kwargs):

/Library/Python/2.7/site-packages/pandas-0.12.0_80_gfcaf9a6_20130729-py2.7-macosx-10.8-intel.egg/pandas/core/series.pyc in __new__(cls, data, index, dtype, name, copy)
    491         else:
    492             subarr = subarr.view(Series)
--> 493         subarr.index = index
    494         subarr.name = name
    495 

/Library/Python/2.7/site-packages/pandas-0.12.0_80_gfcaf9a6_20130729-py2.7-macosx-10.8-intel.egg/pandas/lib.so in pandas.lib.SeriesIndex.__set__ (pandas/lib.c:29092)()

AssertionError: Index length did not match values

As I have shown above, my index length and my values are the same length.

Running a recent (<2 weeks) build from master on OS X 10.8.4 and Python 2.7.2.

@cpcloud
Copy link
Member

cpcloud commented Aug 6, 2013

This has shown up in my attempts to fix some of the period index joining ops.. I'll see if I can track it down

@cpcloud
Copy link
Member

cpcloud commented Aug 6, 2013

i can't reproduce this with the following code

def randts(start, stop, size=1, klass=Series, name=None):
    scalar_mapper = pd.Timestamp
    start, stop = scalar_mapper(start), scalar_mapper(stop)
    mapper = np.vectorize(scalar_mapper)
    values = np.random.random_integers(start.asm8.view(int),
                                       stop.asm8.view(int), size=size)
    return klass(mapper(values), name=name or rands(10))
n = 13536
index = tm.randts(Timestamp('1/1/2009'), Timestamp('1/1/2012'), size=n,
                    name='pdgt10', klass=Index)
values = randn(n)
s = Series(values, index=index)
s.plot()

I get the following plot:

Uploading rand-index-plot.png . . .

@cpcloud
Copy link
Member

cpcloud commented Aug 6, 2013

I'll try with the commit you're on

@cpcloud
Copy link
Member

cpcloud commented Aug 6, 2013

hm it works on that commit...

@fonnesbeck
Copy link
Author

Your example works here. Let me put together an example where it fails.

@cpcloud
Copy link
Member

cpcloud commented Aug 6, 2013

sorry that plot is not showing up my connection is weird.....public wifi is not always so reliable

@fonnesbeck
Copy link
Author

OK, here is a failing example.

@cpcloud
Copy link
Member

cpcloud commented Aug 6, 2013

I get all NaN in foo...is that the issue? Just that it should give a better error message?

@cpcloud
Copy link
Member

cpcloud commented Aug 6, 2013

ahhh nevermind

it should be

foo = Series(inactive.pdgt10.values, inactive.st_time.values)

@cpcloud
Copy link
Member

cpcloud commented Aug 6, 2013

i can repro with that

thanks @fonnesbeck

@fonnesbeck
Copy link
Author

Right, sorry!

@cpcloud
Copy link
Member

cpcloud commented Aug 6, 2013

strangely (or maybe not) if i sort the index then i get a plot

@cpcloud
Copy link
Member

cpcloud commented Aug 6, 2013

maybe the dupe index value are throwing it off

@cpcloud
Copy link
Member

cpcloud commented Aug 6, 2013

either way i'll fix

@cpcloud
Copy link
Member

cpcloud commented Aug 7, 2013

yep it's the dupe index that is the issue...investigating

@cpcloud
Copy link
Member

cpcloud commented Aug 7, 2013

simple reproducible example (using tm.rands()) from above

n = 10
index = tm.randts(1, n, size=n * 2, klass=Index)
values = randn(index.size)
s = Series(values, index=index)
s.plot()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants