For this specific case it seems reasonable that __finalize__ should be used since all of the elements are from the same dataframe, though I'm not sure about the general use since concat can also take types other than a DataFrame. But should we/do we have some method to stack dataframes that preserves metadata?
#6927 will fix/test this, but this is all up to the user at this point; e.g. the default will do nothing. If there is some reasonable set of rules, we can adopt them in the future. The problem is that just about everything is arbitrary at this point.
In [1]: o = Series(range(3),range(3))
In [2]: o.name = 'foo'
In [3]: o2 = Series(range(3),range(3))
In [4]: o2.name = 'bar'
In [5]: Series._metadata = ['name','filename']
In [6]: o.filename = 'foo'
In [7]: o2.filename = 'bar'
: def finalize(self, other, method=None, **kwargs):
:     for name in self._metadata:
:         if method == 'concat' and name == 'filename':
:             value = '+'.join([getattr(o, name) for o in other.objs if getattr(o, name, None)])
:             object.__setattr__(self, name, value)
:         else:
:             object.__setattr__(self, name, getattr(other, name, None))
:
:     return self
In [13]: Series.__finalize__ = finalize
In [14]: result = pd.concat([o, o2])
In [15]: result.name
In [16]: result.filename
Out[16]: 'foo+bar'
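The per-field rule used in the session above can be written without pandas at all, which makes the "everything is arbitrary" point easier to see. This is only a minimal sketch: plain `SimpleNamespace` objects stand in for the two Series, and `join_nonempty` and `merge_metadata` are hypothetical helper names, not pandas API.

```python
from types import SimpleNamespace


def join_nonempty(values, sep="+"):
    """Join the truthy values with `sep`, skipping None and empty strings."""
    return sep.join(v for v in values if v)


def merge_metadata(result, objs, names, rules):
    """Set each metadata field in `names` on `result`.

    A field with an entry in `rules` is combined across all inputs;
    any other field is reset to None, so nothing propagates by default.
    """
    for name in names:
        rule = rules.get(name)
        if rule is None:
            value = None
        else:
            value = rule([getattr(o, name, None) for o in objs])
        setattr(result, name, value)
    return result


# Hypothetical stand-ins for the two Series in the session above.
o = SimpleNamespace(name="foo", filename="foo")
o2 = SimpleNamespace(name="bar", filename="bar")

result = merge_metadata(
    SimpleNamespace(),
    [o, o2],
    names=["name", "filename"],
    rules={"filename": join_nonempty},
)
print(result.name, result.filename)  # prints: None foo+bar
```

Inside a `__finalize__` override, a helper like this would be called with the concat inputs (`other.objs` in the session above) when `method == 'concat'`; which rule to attach to which field is still the user's decision.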
When I assign metadata to a df and define a __finalize__ that prints when it's called, nothing is preserved when pd.concat is called. Similar to #6923.