python - Know feature names after imputation
I run a scikit-learn classifier on a pandas DataFrame (X). Since some data is missing, I use scikit-learn's Imputer like this:
from sklearn.preprocessing import Imputer
imp = Imputer(strategy='mean', axis=0)
X = imp.fit_transform(X)
After doing this, however, the number of features decreased, presumably because the Imputer gets rid of empty columns.
That's fine, except that the Imputer transforms the DataFrame into a NumPy ndarray, so I lose the column/feature names. I need them later on to identify the important features (with clf.feature_importances_).
How can I know the names of the features in clf.feature_importances_ if some of the columns of the initial DataFrame have been dropped by the Imputer?
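You can do this:
import numpy as np
invalid_mask = np.isnan(imp.statistics_)
valid_mask = np.logical_not(invalid_mask)
valid_idx, = np.where(valid_mask)
imp.statistics_ holds the fill value computed for each column. It is NaN for columns that had no observed values, and those are exactly the columns the Imputer drops.
Now you have the old indexes (the indexes these columns had in the matrix X) of the valid columns. You can get the feature names by applying these indexes to the list of feature names of the old X.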
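To make the last step concrete, here is a minimal end-to-end sketch. It uses SimpleImputer from sklearn.impute rather than the Imputer class from the question, because Imputer was removed in newer scikit-learn releases; with its default settings SimpleImputer likewise drops columns that contain no observed values. The toy DataFrame and its column names ("a", "b", "c") are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical toy data: column "b" is entirely missing, so it gets dropped.
X = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, np.nan],
    "c": [4.0, 5.0, np.nan],
})

imp = SimpleImputer(strategy="mean")
X_imputed = imp.fit_transform(X)  # plain ndarray with one column fewer

# statistics_ is NaN for columns with no observed values (the dropped ones).
valid_mask = np.logical_not(np.isnan(imp.statistics_))
valid_idx, = np.where(valid_mask)

# Index the original column labels with the surviving positions.
valid_names = X.columns[valid_idx].tolist()

# Optionally rebuild a labelled DataFrame from the imputed array.
X_named = pd.DataFrame(X_imputed, columns=valid_names, index=X.index)
print(valid_names)
```

After fitting a classifier on X_imputed, valid_names lines up position by position with clf.feature_importances_, so something like zip(valid_names, clf.feature_importances_) pairs each name with its importance.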