python - Efficiently Labeling Variable Values in Pandas
I have a dataframe with variables coded as integers, and I'd like to replace them with the actual value labels.
For example, I have the following dataframe:
>>> df = pd.DataFrame([[1, 3], [2, 2], [3, 2]], columns=['q1', 'q2'])
>>> df
   q1  q2
0   1   3
1   2   2
2   3   2
If the numbers 1, 2, 3 represented the same value in both columns, I could have a dictionary that looked like this:
labels={1:'yes',2:'no',3:'unsure'}
and recode with applymap:
>>> df.applymap(labels.get)
       q1      q2
0     yes  unsure
1      no      no
2  unsure      no
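For readers on a newer pandas: `DataFrame.applymap` was deprecated in pandas 2.1 in favor of `DataFrame.map`. The following is a minimal sketch of the same shared-dictionary recode that picks whichever method is available; the names `elementwise` and `recoded` are illustrative and not from the original post.

```python
import pandas as pd

df = pd.DataFrame([[1, 3], [2, 2], [3, 2]], columns=["q1", "q2"])
labels = {1: "yes", 2: "no", 3: "unsure"}

# Use DataFrame.map where it exists (pandas >= 2.1); otherwise fall back
# to the older applymap. Both apply a function to every cell.
elementwise = df.map if hasattr(df, "map") else df.applymap

# labels.get returns None for any code that is missing from the dictionary.
recoded = elementwise(labels.get)
print(recoded)
```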
However, the integers code for a different label in each column. For example, the dictionary of value labels may look like this:
labels2={'q1':{1:'yes',2:'no',3:'unsure'}, 'q2':{1:'very', 2:'a little', 3:'not at all'}}
What is an efficient way of recoding the values in this scenario?
I am using apply and a loop (see below), but it is pretty clunky. Is there a better way?
>>> import pandas as pd
>>> dfs = []
>>> for question in labels2:
...     d = df[question].map(labels2[question].get)
...     dfs.append(d)
...
>>> pd.concat(dfs, axis=1)
       q1          q2
0     yes  not at all
1      no    a little
2  unsure    a little
You can use apply, and use each column's name attribute as the key into the outer dictionary:
>>> df.apply(lambda col: col.map(labels2[col.name]))
       q1          q2
0     yes  not at all
1      no    a little
2  unsure    a little
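One caveat with the one-liner: `labels2[col.name]` raises a `KeyError` if the dataframe has a column with no entry in the outer dictionary. Below is a minimal sketch of a more tolerant variant, assuming you want such columns left unchanged. The helper `label_columns` and the extra `age` column are hypothetical and only there for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 3], [2, 2], [3, 2]], columns=["q1", "q2"])
labels2 = {
    "q1": {1: "yes", 2: "no", 3: "unsure"},
    "q2": {1: "very", 2: "a little", 3: "not at all"},
}


def label_columns(frame, value_labels):
    """Hypothetical helper: recode each column using its own label dict."""

    def recode(col):
        mapping = value_labels.get(col.name)
        if mapping is None:
            # No labels defined for this column: return it untouched.
            return col
        # Series.map yields NaN for codes that are missing from the mapping.
        return col.map(mapping)

    return frame.apply(recode)


labeled = label_columns(df, labels2)
print(labeled)

# A column without labels (here the made-up "age") passes through as is.
with_age = label_columns(df.assign(age=[30, 41, 25]), labels2)
print(with_age)
```

Using `value_labels.get` instead of indexing is the only real difference from the lambda version; the recoding itself is still `apply` plus `Series.map`.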