gluonts.evaluation.backtest module

class gluonts.evaluation.backtest.BacktestInformation(train_dataset_stats, test_dataset_stats, estimator, agg_metrics)[source]

Bases: tuple

property agg_metrics

Alias for field number 3

property estimator

Alias for field number 2

static make_from_log(log_file)[source]

static make_from_log_contents(log_contents)[source]

property test_dataset_stats

Alias for field number 1

property train_dataset_stats

Alias for field number 0
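
Because BacktestInformation derives from tuple, each property is a named alias for a positional field. The sketch below mimics that layout with collections.namedtuple. BacktestInfoSketch and all field values are hypothetical stand-ins chosen for illustration, not the library class or real backtest output.

```python
from collections import namedtuple

# Hypothetical stand-in that uses the same field order as the documented class.
BacktestInfoSketch = namedtuple(
    "BacktestInfoSketch",
    ["train_dataset_stats", "test_dataset_stats", "estimator", "agg_metrics"],
)

info = BacktestInfoSketch(
    train_dataset_stats={"num_time_series": 10},  # placeholder values
    test_dataset_stats={"num_time_series": 10},
    estimator="SomeEstimator(...)",
    agg_metrics={"MSE": 0.25},
)

# Name-based and index-based access return the same object.
assert info.train_dataset_stats is info[0]
assert info.agg_metrics is info[3]

# Tuple unpacking follows the documented field order.
train_stats, test_stats, estimator, agg_metrics = info
```

Accessing fields by name keeps calling code readable, while the tuple base still allows positional unpacking.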

gluonts.evaluation.backtest.backtest_metrics(train_dataset: Optional[Iterable[Dict[str, Any]]], test_dataset: Iterable[Dict[str, Any]], forecaster: Union[gluonts.model.estimator.Estimator, gluonts.model.predictor.Predictor], evaluator=<gluonts.evaluation._base.Evaluator object>, num_samples: int = 100, logging_file: Optional[str] = None, use_symbol_block_predictor: Optional[bool] = False, num_workers: Optional[int] = None, num_prefetch: Optional[int] = None, **kwargs)[source]

Parameters

  • train_dataset – Dataset to use for training.

  • test_dataset – Dataset to use for testing.

  • forecaster – An estimator or a predictor to use for generating predictions.

  • evaluator – Evaluator to use.

  • num_samples – Number of samples to use when generating sample-based forecasts.

  • logging_file – If specified, information about the backtest is redirected to this file.

  • use_symbol_block_predictor – Use a SymbolBlockPredictor during testing.

  • num_workers – The number of multiprocessing workers to use for data preprocessing. By default 0, in which case no multiprocessing will be utilized.

  • num_prefetch – The number of batches to prefetch; only takes effect if num_workers > 0. If num_prefetch > 0, worker processes may prefetch that many batches before data is requested from the iterators. A larger value gives smoother bootstrapping performance but consumes more shared memory. A smaller value may defeat the purpose of using multiple worker processes; consider reducing num_workers in that case. Defaults to num_workers * 2.


Returns

A tuple of aggregate metrics and per-time-series metrics, obtained by training forecaster on train_dataset (when forecaster is an estimator) and evaluating the resulting predictor on test_dataset with the provided evaluator.

Return type

tuple
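
The flow described for backtest_metrics (train if given an estimator, forecast the held-out tail of each test series, return aggregate and per-series metrics) can be illustrated without the library. The following is a conceptual sketch only: MeanEstimator, MeanPredictor, backtest_sketch, and the MSE-only metrics are hypothetical and much simpler than the real Estimator, Predictor, and Evaluator classes.

```python
from statistics import mean


class MeanPredictor:
    """Hypothetical predictor that forecasts a constant level."""

    def __init__(self, level, prediction_length):
        self.level = level
        self.prediction_length = prediction_length

    def predict(self, history):
        return [self.level] * self.prediction_length


class MeanEstimator:
    """Hypothetical estimator whose train() returns a predictor."""

    def __init__(self, prediction_length):
        self.prediction_length = prediction_length

    def train(self, train_dataset):
        values = [v for entry in train_dataset for v in entry["target"]]
        return MeanPredictor(mean(values), self.prediction_length)


def backtest_sketch(train_dataset, test_dataset, forecaster):
    # Train only when given an estimator; a predictor is used as-is.
    if hasattr(forecaster, "train"):
        predictor = forecaster.train(train_dataset)
    else:
        predictor = forecaster

    n = predictor.prediction_length
    item_metrics = []
    for entry in test_dataset:
        # Hide the last n values from the model, then score against them.
        history, actual = entry["target"][:-n], entry["target"][-n:]
        forecast = predictor.predict(history)
        mse = mean((f - a) ** 2 for f, a in zip(forecast, actual))
        item_metrics.append({"item_id": entry["item_id"], "MSE": mse})

    agg_metrics = {"MSE": mean(m["MSE"] for m in item_metrics)}
    return agg_metrics, item_metrics


train = [{"item_id": "a", "target": [1.0, 2.0, 3.0]}]
test = [{"item_id": "a", "target": [1.0, 2.0, 3.0, 3.0, 1.0]}]

agg_metrics, item_metrics = backtest_sketch(
    train, test, MeanEstimator(prediction_length=2)
)
```

Passing an already trained predictor skips the training step, which is why train_dataset is optional in the documented signature.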


gluonts.evaluation.backtest.make_evaluation_predictions(dataset: Iterable[Dict[str, Any]], predictor: gluonts.model.predictor.Predictor, num_samples: int) → Tuple[Iterator[gluonts.model.forecast.Forecast], Iterator[pandas.core.series.Series]][source]

Return predictions on the last portion of the target, of length prediction_length time units. This portion is cut off before making predictions, so the function can be used in evaluations where accuracy is measured on the last portion of the target.

Parameters

  • dataset – Dataset on which the evaluation is performed. Only the portion excluding the final prediction_length time units is used when making predictions.

  • predictor – Model used to draw predictions.

  • num_samples – Number of samples to draw from the model when evaluating.
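
The truncation step described above can be sketched in plain Python. The helper split_for_evaluation and the sample entry are hypothetical and only illustrate which part of the target the model sees and which part is held out for scoring; the library itself works with its own dataset and pandas types.

```python
def split_for_evaluation(entry, prediction_length):
    """Hypothetical helper: split one entry into model input and held-out tail."""
    target = entry["target"]
    assert prediction_length > 0, "prediction_length must be positive"
    assert len(target) > prediction_length, "series too short to evaluate"

    # Copy the entry so the original dataset is left untouched.
    model_input = dict(entry, target=target[:-prediction_length])
    held_out = target[-prediction_length:]
    return model_input, held_out


entry = {"start": "2020-01-01", "target": [10, 12, 11, 13, 14, 15]}
model_input, held_out = split_for_evaluation(entry, prediction_length=2)

# The model only sees the first four values; the last two are used for scoring.
assert model_input["target"] == [10, 12, 11, 13]
assert held_out == [14, 15]
```

Copying the entry rather than modifying it in place keeps the full series available for comparison against the forecasts.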

gluonts.evaluation.backtest.serialize_message(logger, message: str, variable)[source]