I have a DataFrame with 2 columns and a 3-level index. The columns are price and volume, and the index levels are trader, stock, and day.
I want to compute the rolling mean of price and volume over the last 50 days for each trader-stock combination in the data.
This is what I came up with so far.
test=test.set_index(['date','trader', 'stock'])
test=test.unstack().unstack()
test=test.resample("1d")
test=test.fillna(0)
test[[col+'_norm' for col in test.columns]]=test.apply(lambda x: pd.rolling_mean(x,50,50))
test.stack().stack().reset_index().set_index(['trader', 'stock','date']).sort_index().head()
That is, I unstack the dataset twice so that only the time axis is left, and I can then compute a 50-day rolling mean of my variables, because 50 observations correspond to 50 days (after having resampled the data).
The problem is that I don't know how to create the right names for my rolling mean variables:
test[[col+'_norm' for col in test.columns]]
TypeError: can only concatenate tuple (not "str") to tuple
Any ideas what is wrong here? Is my algorithm correct for getting these rolling means? Many thanks!
The result of pd.rolling_mean (with modified column names) can be concatenated with the original DataFrame:
means = pd.rolling_mean(test, 50, 50)
means.columns = [('{}_norm'.format(col[0]),)+col[1:] for col in means.columns]
test = pd.concat([test, means], axis=1)
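The renaming step works on tuples because, after unstacking, the columns form a MultiIndex and each column label is a tuple rather than a string. That is also why `col+'_norm'` raises the TypeError in the question. A minimal sketch of this, using a small made-up frame (the labels `price`, `volume` and `A` are illustrative only, not the asker's data):

```python
import pandas as pd

# Hypothetical two-level column labels, similar to what unstack() produces
cols = pd.MultiIndex.from_tuples([("price", "A"), ("volume", "A")])
df = pd.DataFrame([[1.0, 10.0], [2.0, 20.0]], columns=cols)

# Iterating over a MultiIndex yields tuples, so tuple + str fails
error_message = None
try:
    [col + "_norm" for col in df.columns]
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Rebuild each tuple, adding the suffix to the first level only
renamed = [("{}_norm".format(col[0]),) + col[1:] for col in df.columns]
print(renamed)  # [('price_norm', 'A'), ('volume_norm', 'A')]
```

Only the first element of each tuple (the variable name) gets the suffix; the remaining levels are kept so the new columns still line up with their trader and stock.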
For example:
import numpy as np
import pandas as pd
N = 10
test = pd.DataFrame(np.random.randint(4, size=(N, 3)),
columns=['trader', 'stock', 'foo'],
index=pd.date_range('2000-1-1', periods=N))
test.index.names = ['date']
test = test.set_index(['trader', 'stock'], append=True)
test = test.unstack().unstack()
test = test.resample("1D")
test = test.fillna(0)
means = pd.rolling_mean(test, 50, 50)
means.columns = [('{}_norm'.format(col[0]),)+col[1:] for col in means.columns]
test = pd.concat([test, means], axis=1)
test = test.stack().stack()
test = test.reorder_levels(['trader', 'stock', 'date'])
test = test.sort_index()
print(test.head())
yields
                         foo  foo_norm
trader stock date
0      0     2000-01-01    0       NaN
             2000-01-02    0       NaN
             2000-01-03    0       NaN
             2000-01-04    0       NaN
             2000-01-05    0       NaN
...
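In later pandas releases, pd.rolling_mean is no longer available and has been replaced by the .rolling() method. Resampling there also needs an explicit step such as .asfreq() before fillna. The following is a sketch of the same approach under those assumptions, not a tested drop-in for the asker's data. The sample frame, the seed, and the short window of 3 are chosen here only so that the small example produces some non-NaN values; the question's window is 50.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, illustrative data only
n = 10
window = 3  # assumption for this small sample; the question uses 50

test = pd.DataFrame(
    {
        "trader": rng.integers(0, 2, size=n),
        "stock": rng.integers(0, 2, size=n),
        "foo": rng.integers(0, 4, size=n).astype(float),
    },
    index=pd.date_range("2000-01-01", periods=n, name="date"),
)
test = test.set_index(["trader", "stock"], append=True)

# Move stock and trader into the columns so only the date axis remains
wide = test.unstack().unstack()
# Make the date axis strictly daily, then fill gaps with 0 as in the question
wide = wide.resample("1D").asfreq().fillna(0)

# Rolling mean over `window` rows, which equal `window` days after resampling
means = wide.rolling(window=window, min_periods=window).mean()
means.columns = pd.MultiIndex.from_tuples(
    [("{}_norm".format(col[0]),) + col[1:] for col in means.columns],
    names=means.columns.names,
)

combined = pd.concat([wide, means], axis=1)
# Move trader and stock back into the index
long = combined.stack().stack()
long = long.reorder_levels(["trader", "stock", "date"]).sort_index()
print(long.head())
```

As in the original, the first `window - 1` dates have NaN in the `_norm` column because `min_periods` equals the window length.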