You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
import pandas as pd
dates = [ 20140101, 20140102, 20140103]
states = [ "CA", "NY", "CA"]
x = pd.DataFrame({ 'dates' : dates, 'states' : states })
#y = pandas.DataFrame({ 'state' : [ 'CA', 'OR' ], 'value' : [ 1, 2]})
y = pd.DataFrame({ 'states' : [ "CA", "NY" ], 'stateid' : [ 1, 2]})
z = pd.merge(x, y, how='left', on='states',)
x=
dates states
0 20140101 CA
1 20140102 NY
2 20140103 CA
y=
stateid states
0 1 CA
1 2 NY
z= dates states stateid
0 20140101 CA 1
1 20140103 CA 1
2 20140102 NY 2
Note z is always sorted by "states" column whether argument sort=True or False.
This only happens when the x's states column is not unique. If x.states is unique(such as NY, CA, CT), sort=True and False behaves as expected.
This causes inconvenience when x is a time series and merge does not
preserve the time sequence.
The text was updated successfully, but these errors were encountered:
hmm, could be a subtle bug in there; this didn't work in prior versions, but simply may be a function of the impl. if you have time and want to wade thru code, welcome to have a look!
Note z is always sorted by "states" column whether argument sort=True or False.
This only happens when the x's states column is not unique. If x.states is unique(such as NY, CA, CT), sort=True and False behaves as expected.
This causes inconvenience when x is a time series and merge does not
preserve the time sequence.
The text was updated successfully, but these errors were encountered: