StackingEnsemble

class StackingEnsemble(pipelines: List[etna.pipeline.base.BasePipeline], final_model: sklearn.base.RegressorMixin = LinearRegression(), n_folds: int = 3, features_to_use: Union[None, Literal['all'], List[str]] = None, n_jobs: int = 1, joblib_params: Optional[Dict[str, Any]] = None)

Bases: etna.pipeline.base.BasePipeline, etna.ensembles.base.EnsembleMixin

StackingEnsemble is a pipeline that forecasts the future by using a meta-model to combine the forecasts of the base models.

Examples

>>> from etna.datasets import generate_ar_df
>>> from etna.datasets import TSDataset
>>> from etna.ensembles import StackingEnsemble
>>> from etna.models import NaiveModel
>>> from etna.models import MovingAverageModel
>>> from etna.pipeline import Pipeline
>>> import pandas as pd
>>> pd.options.display.float_format = '{:,.2f}'.format
>>> df = generate_ar_df(periods=100, start_time="2021-06-01", ar_coef=[0.8], n_segments=3)
>>> df_ts_format = TSDataset.to_dataset(df)
>>> ts = TSDataset(df_ts_format, "D")
>>> ma_pipeline = Pipeline(model=MovingAverageModel(window=5), transforms=[], horizon=7)
>>> naive_pipeline = Pipeline(model=NaiveModel(lag=10), transforms=[], horizon=7)
>>> ensemble = StackingEnsemble(pipelines=[ma_pipeline, naive_pipeline])
>>> _ = ensemble.fit(ts=ts)
>>> forecast = ensemble.forecast()
>>> forecast[:,:,"target"]
segment    segment_0 segment_1 segment_2
feature       target    target    target
timestamp
2021-09-09      0.70      1.47      0.20
2021-09-10      0.62      1.53      0.26
2021-09-11      0.50      1.78      0.36
2021-09-12      0.37      1.88      0.21
2021-09-13      0.46      1.87      0.25
2021-09-14      0.44      1.49      0.21
2021-09-15      0.36      1.56      0.30

Init StackingEnsemble.

Parameters
  • pipelines (List[etna.pipeline.base.BasePipeline]) – List of pipelines that should be used in ensemble.

  • final_model (sklearn.base.RegressorMixin) – Regression model with fit/predict interface which will be used to combine the base estimators.

  • n_folds (int) – Number of folds to use in the backtest. The backtest is used here not to evaluate the models but to produce the base pipelines' forecasts, which final_model then learns to combine.

  • features_to_use (Union[None, Literal['all'], typing.List[str]]) – Features, other than the forecasts of the base models, to use in the final_model.

  • n_jobs (int) – Number of jobs to run in parallel.

  • joblib_params (Optional[Dict[str, Any]]) – Additional parameters for joblib.Parallel.
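The role of n_folds and final_model can be illustrated with a minimal, library-free sketch of the stacking idea. It is not etna's implementation: the helper names (naive_forecast, moving_average_forecast, backtest_forecasts, fit_least_squares, stacking_forecast) are hypothetical, the data is a plain list for a single segment, and a hand-written ordinary least squares fit stands in for LinearRegression. The base forecasters are run over the last n_folds windows of the series, and their out-of-sample forecasts become the features on which the meta-model is trained.

```python
from statistics import mean


def naive_forecast(history, horizon, lag=1):
    """Hypothetical base model: repeat the value seen `lag` steps earlier."""
    values = list(history)
    for _ in range(horizon):
        values.append(values[-lag])
    return values[len(history):]


def moving_average_forecast(history, horizon, window=3):
    """Hypothetical base model: recursive mean of the last `window` values."""
    values = list(history)
    for _ in range(horizon):
        values.append(mean(values[-window:]))
    return values[len(history):]


def backtest_forecasts(series, base_models, horizon, n_folds):
    """Collect out-of-sample base forecasts and true targets over the last folds."""
    rows, targets = [], []
    for fold in range(n_folds, 0, -1):
        split = len(series) - fold * horizon
        train, test = series[:split], series[split:split + horizon]
        preds = [model(train, horizon) for model in base_models]
        rows.extend(zip(*preds))  # one row of base forecasts per timestamp
        targets.extend(test)
    return rows, targets


def fit_least_squares(rows, targets):
    """Ordinary least squares with an intercept, via the normal equations."""
    X = [[1.0, *row] for row in rows]
    n = len(X[0])
    # Augmented system (X^T X | X^T y).
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def stacking_forecast(series, base_models, horizon, n_folds=3):
    """Train the meta-model on backtest forecasts, then combine fresh forecasts."""
    rows, targets = backtest_forecasts(series, base_models, horizon, n_folds)
    weights = fit_least_squares(rows, targets)
    future = zip(*[model(series, horizon) for model in base_models])
    return [
        weights[0] + sum(w * p for w, p in zip(weights[1:], row))
        for row in future
    ]
```

With horizon=4 and n_folds=3, the meta-model in this sketch is trained on 12 rows, one per backtest timestamp, each holding one forecast per base model.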

Raises

ValueError – If there are fewer than 2 pipelines or the pipelines have different horizons.
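The two conditions can be pictured with a small stand-alone check. This is only a sketch of the documented rule, not the library's code: PipelineStub and validate_pipelines are hypothetical names, and the error messages are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PipelineStub:
    """Hypothetical stand-in for a pipeline; only the horizon matters here."""

    horizon: int


def validate_pipelines(pipelines: List[PipelineStub]) -> int:
    """Return the shared horizon, or raise ValueError for an invalid set."""
    if len(pipelines) < 2:
        raise ValueError("At least two pipelines are expected.")
    horizons = {pipeline.horizon for pipeline in pipelines}
    if len(horizons) > 1:
        raise ValueError("All pipelines should have the same horizon.")
    return horizons.pop()
```

Checking both conditions up front means a misconfigured ensemble fails at construction time rather than partway through fitting.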

Methods

backtest(ts, metrics[, n_folds, mode, ...]) – Run backtest with the pipeline.

fit(ts) – Fit the ensemble.

forecast([prediction_interval, quantiles, ...]) – Make predictions.

fit(ts: etna.datasets.tsdataset.TSDataset) → etna.ensembles.stacking_ensemble.StackingEnsemble

Fit the ensemble.

Parameters

ts (etna.datasets.tsdataset.TSDataset) – TSDataset to fit the ensemble on.

Returns

Fitted ensemble.

Return type

self