python - Merge a lot of DataFrames together, without loop and not using concat -


i have >1000 dataframes, each have >20k rows , several columns, need merge common column, idea can illustrated this:

data1=pd.dataframe({'name':['a','c','e'], 'value':[1,3,4]}) data2=pd.dataframe({'name':['a','d','e'], 'value':[3,3,4]}) data3=pd.dataframe({'name':['d','e','f'], 'value':[1,3,5]}) data4=pd.dataframe({'name':['d','f','g'], 'value':[0,3,4]}) #some or them may have more or less columns others: #data5=pd.dataframe({'name':['d','f','g'], 'value':[0,3,4], 'score':[1,3,4]})  final_data=data1 i, v in enumerate([data2, data3, data4]):     if i==0:         final_data=pd.merge(final_data, v, how='outer', left_on='name',                              right_on='name', suffixes=('_0', '_%s'%(i+1)))         #in real case right_on may = columns other 'name'          #dependents on dataframe, requirement can          #ignored in minimal example.      else:         final_data=pd.merge(final_data, v, how='outer', left_on='name',                              right_on='name', suffixes=('', '_%s'%(i+1)))    

result:

  name  value_0  value_1  value  value_3 0           1        3    nan      nan 1    c        3      nan    nan      nan 2    e        4        4      3      nan 3    d      nan        3      1        0 4    f      nan      nan      5        3 5    g      nan      nan    nan        4  [6 rows x 5 columns] 

it works, anyway can done without loop?

also, why column name of second last column not value_2?


p.s. know in minimal example, result can achieved by:

pd.concat([item.set_index('name') item in [data1, data2, data3, data4]], axis=1)  

but in real case due way how dataframes constructed , information stored in index columns, not ideal solution without additional tricks. so, let's not consider route.

does make sense merge it, then? what's wrong panel?

> data = [data1, data2, data3, data4] > p = pd.panel(dict(zip(map(str, range(len(data))), data))) > p.to_frame().t major    0          1          2       minor name value name value name value 0            1    c     3    e     4 1            3    d     3    e     4 2        d     1    e     3    f     5 3        d     0    f     3    g     4  # , kicks > p.transpose(2, 0, 1).to_frame().reset_index().pivot_table(values='value', rows='name', cols='major') major   0   1   2   3 name                        1   3 nan nan c       3 nan nan nan d     nan   3   1   0 e       4   4   3 nan f     nan nan   5   3 g     nan nan nan   4 

Comments

Popular posts from this blog

how to proxy from https to http with lighttpd -

android - Automated my builds -

python - Flask migration error -