TreeFeatureSelectionTransform
- class TreeFeatureSelectionTransform(model: Union[sklearn.tree._classes.DecisionTreeRegressor, sklearn.tree._classes.ExtraTreeRegressor, sklearn.ensemble._forest.RandomForestRegressor, sklearn.ensemble._forest.ExtraTreesRegressor, sklearn.ensemble._gb.GradientBoostingRegressor, catboost.core.CatBoostRegressor], top_k: int, features_to_use: Union[List[str], Literal['all']] = 'all')
Bases: etna.transforms.feature_selection.base.BaseFeatureSelectionTransform
Transform that selects features according to the feature importances of a tree-based model.
Notes
The transform works with any type of features; however, most of the models work only with regressors. It is therefore recommended to pass the regressors into the feature selection transforms.
Init TreeFeatureSelectionTransform.
- Parameters
model (Union[sklearn.tree._classes.DecisionTreeRegressor, sklearn.tree._classes.ExtraTreeRegressor, sklearn.ensemble._forest.RandomForestRegressor, sklearn.ensemble._forest.ExtraTreesRegressor, sklearn.ensemble._gb.GradientBoostingRegressor, catboost.core.CatBoostRegressor]) – model used to make the selection; it should have a feature_importances_ property (e.g. all tree-based regressors in sklearn)
top_k (int) – number of features to select; if there are not enough features, all of them will be selected
features_to_use (Union[List[str], Literal['all']]) – columns of the dataset to select from; if “all” value is given, all columns are used
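The interaction between top_k and features_to_use described above can be sketched without any dependencies. This is a minimal illustration, not the library's implementation: the helper name select_top_k, the dictionary of precomputed importances, and the sample scores are all hypothetical. In the real transform the importances come from the fitted model's feature_importances_ property.

```python
from typing import Dict, List, Union


def select_top_k(
    importances: Dict[str, float],
    top_k: int,
    features_to_use: Union[List[str], str] = "all",
) -> List[str]:
    """Return the names of the top_k most important candidate features.

    Hypothetical helper for illustration; not part of etna.
    """
    if top_k < 0:
        raise ValueError("top_k must be non-negative")
    # Restrict the candidates to features_to_use unless "all" is requested.
    if features_to_use == "all":
        candidates = list(importances)
    else:
        candidates = [name for name in features_to_use if name in importances]
    # Highest importance first; sorted() is stable, so ties keep input order.
    ranked = sorted(candidates, key=lambda name: importances[name], reverse=True)
    # Slicing past the end is safe: with fewer than top_k candidates,
    # every candidate is returned.
    return ranked[:top_k]


# Made-up importance scores, used only to exercise the helper.
scores = {"holiday": 0.10, "price": 0.55, "promo": 0.30, "weekday": 0.05}

print(select_top_k(scores, top_k=2))
print(select_top_k(scores, top_k=10))
print(select_top_k(scores, top_k=1, features_to_use=["holiday", "weekday"]))
```

The second call asks for more features than exist, so all four names come back. The third call ranks only the two listed columns.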
Methods
fit(df) – Fit the model and remember features to select.
fit_transform(df) – May be reimplemented.
inverse_transform(df) – Inverse transforms dataframe.
transform(df) – Select top_k features.
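The split between fit (remember which features to keep) and transform (apply that choice) can be illustrated with a toy, dependency-free stand-in. Everything here is an assumption made for the sketch: ToySelector, the dict-of-lists Table type, and the abs_covariance scorer are hypothetical and not part of etna, and absolute covariance with the target is swapped in for a tree model's feature_importances_.

```python
from typing import Callable, Dict, List, Optional

Table = Dict[str, List[float]]  # column name -> values; stand-in for a DataFrame


def abs_covariance(table: Table, target: str) -> Dict[str, float]:
    """Score each non-target column by |cov(column, target)| (a simple proxy)."""
    y = table[target]
    y_mean = sum(y) / len(y)
    scores = {}
    for name, x in table.items():
        if name == target:
            continue
        x_mean = sum(x) / len(x)
        cov = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y)) / len(y)
        scores[name] = abs(cov)
    return scores


class ToySelector:
    """Hypothetical fit/transform selector; not etna's implementation."""

    def __init__(self, scorer: Callable[[Table, str], Dict[str, float]], top_k: int):
        self.scorer = scorer
        self.top_k = top_k
        self.selected: Optional[List[str]] = None

    def fit(self, table: Table, target: str = "target") -> "ToySelector":
        # Score every non-target column and remember the top_k names.
        scores = self.scorer(table, target)
        ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
        self.selected = ranked[: self.top_k]
        return self  # return the fitted instance so calls can be chained

    def transform(self, table: Table, target: str = "target") -> Table:
        if self.selected is None:
            raise RuntimeError("call fit() before transform()")
        keep = set(self.selected) | {target}
        # Keep the target and the remembered features; drop everything else.
        return {name: values for name, values in table.items() if name in keep}


# Made-up data: "a" tracks the target, "b" is constant, "c" moves against it.
table = {
    "target": [1.0, 2.0, 3.0, 4.0],
    "a": [2.0, 4.0, 6.0, 8.0],
    "b": [1.0, 1.0, 1.0, 1.0],
    "c": [4.0, 3.0, 2.0, 2.0],
}
selector = ToySelector(abs_covariance, top_k=2).fit(table)
print(selector.selected)
print(list(selector.transform(table)))
```

Because the selection is stored on the instance during fit, transform can later be applied to other tables with the same columns without recomputing the scores.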
- fit(df: pandas.core.frame.DataFrame) → etna.transforms.feature_selection.feature_importance.TreeFeatureSelectionTransform
Fit the model and remember features to select.
- Parameters
df (pandas.core.frame.DataFrame) – dataframe with all segments data
- Returns
result – instance after fitting
- Return type