gluonts.dataset.pandas module

class gluonts.dataset.pandas.PandasDataset(dataframes: typing.Union[pandas.core.frame.DataFrame, pandas.core.series.Series, typing.List[pandas.core.frame.DataFrame], typing.List[pandas.core.series.Series], typing.Dict[str, pandas.core.frame.DataFrame], typing.Dict[str, pandas.core.series.Series]], target: typing.Union[str, typing.List[str]] = 'target', timestamp: typing.Optional[str] = None, freq: typing.Optional[str] = None, feat_dynamic_real: typing.List[str] = <factory>, feat_dynamic_cat: typing.List[str] = <factory>, feat_static_real: typing.List[str] = <factory>, feat_static_cat: typing.List[str] = <factory>, past_feat_dynamic_real: typing.List[str] = <factory>, ignore_last_n_targets: int = 0)

Bases: object

A pandas.DataFrame-based dataset type.

This class is constructed from a collection of pandas.DataFrame objects, where each DataFrame represents one time series. A target column and a timestamp column are essential. In addition, static and dynamic features, each either real-valued or categorical, can be specified.

Parameters
  • dataframes (Union[pandas.core.frame.DataFrame, pandas.core.series.Series, List[pandas.core.frame.DataFrame], List[pandas.core.series.Series], Dict[str, pandas.core.frame.DataFrame], Dict[str, pandas.core.series.Series]]) – A single pd.DataFrame or pd.Series, or a collection of them given as a list or dict, containing at least timestamp and target values. If a Dict is provided, each key is used as the item_id of the associated time series.

  • target (Union[str, List[str]]) – Name of the column that contains the target time series. For multivariate targets, a list of column names should be provided.

  • timestamp (Optional[str]) – Name of the column that contains the timestamp information.

  • freq (Optional[str]) – Frequency of observations in the time series. Must be a valid pandas frequency.

  • feat_dynamic_real (List[str]) – List of column names that contain dynamic real features.

  • feat_dynamic_cat (List[str]) – List of column names that contain dynamic categorical features.

  • feat_static_real (List[str]) – List of column names that contain static real features.

  • feat_static_cat (List[str]) – List of column names that contain static categorical features.

  • past_feat_dynamic_real (List[str]) – List of column names that contain dynamic real features only for the history.

  • ignore_last_n_targets (int) – For the target and the past dynamic features, the last ignore_last_n_targets elements are removed when iterating over the dataset. This becomes important when the predictor is called.

dataframes: Union[pandas.core.frame.DataFrame, pandas.core.series.Series, List[pandas.core.frame.DataFrame], List[pandas.core.series.Series], Dict[str, pandas.core.frame.DataFrame], Dict[str, pandas.core.series.Series]]
feat_dynamic_cat: List[str]
feat_dynamic_real: List[str]
feat_static_cat: List[str]
feat_static_real: List[str]
freq: Optional[str] = None
classmethod from_long_dataframe(dataframe: pandas.core.frame.DataFrame, item_id: str, **kwargs) → gluonts.dataset.pandas.PandasDataset

Construct a PandasDataset from a long dataframe. A long dataframe uses the long format for each variable: target time series values, for example, are stacked on top of each other rather than placed side by side. The same is true for other dynamic or categorical features.

Parameters
  • dataframe – pandas.DataFrame containing at least timestamp, target and item_id columns.

  • item_id – Name of the column that, when grouped by, gives the different time series.

  • **kwargs – Additional arguments, the same as those accepted by the PandasDataset class.

Returns

A GluonTS dataset based on pandas.DataFrame objects.

Return type

PandasDataset
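The long format described above can be illustrated with plain pandas. The following is a minimal sketch, not the library's implementation: it only shows how grouping a long dataframe by its item_id column yields one dataframe per time series. The column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical long dataframe: two series ("A" and "B") stacked on top of
# each other, distinguished by the item_id column.
long_df = pd.DataFrame(
    {
        "item_id": ["A", "A", "A", "B", "B", "B"],
        "timestamp": pd.to_datetime(
            ["2021-01-01", "2021-01-02", "2021-01-03"] * 2
        ),
        "target": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    }
)

# Grouping by item_id gives one dataframe per time series, keyed by item_id.
per_item = {
    item: frame.drop(columns="item_id").set_index("timestamp")
    for item, frame in long_df.groupby("item_id")
}

for item, frame in per_item.items():
    print(item, frame["target"].to_list())
```

A dict of this shape matches the Dict[str, pandas.DataFrame] form that the dataframes parameter of PandasDataset accepts, with each key acting as the item_id.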

ignore_last_n_targets: int = 0
past_feat_dynamic_real: List[str]
target: Union[str, List[str]] = 'target'
timestamp: Optional[str] = None
gluonts.dataset.pandas.as_dataentry(data: pandas.core.frame.DataFrame, target: Union[str, List[str]], timestamp: Optional[str] = None, feat_dynamic_real: List[str] = [], feat_dynamic_cat: List[str] = [], feat_static_real: List[str] = [], feat_static_cat: List[str] = [], past_feat_dynamic_real: List[str] = []) → Dict[str, Any]

Convert a single time series (uni- or multi-variate) that is given in a pandas.DataFrame format to a DataEntry.

Parameters
  • data – pandas.DataFrame containing a single time series, with at least the target column(s) and the timestamp information (either as a column or as the index).

  • target – Name of the column that contains the target time series. For multivariate targets, a list of column names is expected.

  • timestamp – Name of the column that contains the timestamp information. If None, the index of data is assumed to be the time.

  • feat_dynamic_real – List of column names that contain dynamic real features.

  • feat_dynamic_cat – List of column names that contain dynamic categorical features.

  • feat_static_real – List of column names that contain static real features.

  • feat_static_cat – List of column names that contain static categorical features.

  • past_feat_dynamic_real – List of column names that contain dynamic real features only for the history.

Returns

A dictionary with at least the target and start fields.

Return type

DataEntry
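To make the shape of the returned dictionary concrete, here is a minimal pandas-only sketch. The helper sketch_dataentry is hypothetical and is not the library function. It assumes the start field is the first timestamp and the target field holds the target values as lists. Feature columns are left out.

```python
from typing import Any, Dict, List, Optional, Union

import pandas as pd


def sketch_dataentry(
    data: pd.DataFrame,
    target: Union[str, List[str]],
    timestamp: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build a dict with "start" and "target" fields."""
    # Use the timestamp column if one is named, otherwise fall back to the index.
    start = data[timestamp].iloc[0] if timestamp else data.index[0]
    if isinstance(target, str):
        values = data[target].to_list()  # univariate: flat list
    else:
        values = [data[name].to_list() for name in target]  # one list per column
    return {"start": start, "target": values}


df = pd.DataFrame(
    {
        "time": pd.date_range("2021-01-01", periods=3, freq="D"),
        "target": [1.0, 2.0, 3.0],
        "other": [4.0, 5.0, 6.0],
    }
)

# Univariate, with the time given as a column.
entry = sketch_dataentry(df, target="target", timestamp="time")

# Multivariate, with the time given as the index.
multi = sketch_dataentry(df.set_index("time"), target=["target", "other"])
```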

gluonts.dataset.pandas.is_series(series: Any) → bool

Return True if series is a pd.Series or a collection of pd.Series.

gluonts.dataset.pandas.is_uniform(index: pandas.core.indexes.period.PeriodIndex) → bool

Check if index contains monotonically increasing periods, evenly spaced with frequency index.freq.

>>> ts = ["2021-01-01 00:00", "2021-01-01 02:00", "2021-01-01 04:00"]
>>> is_uniform(pd.DatetimeIndex(ts).to_period("2H"))
True
>>> ts = ["2021-01-01 00:00", "2021-01-01 04:00"]
>>> is_uniform(pd.DatetimeIndex(ts).to_period("2H"))
False
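One way to express such a check with plain pandas is to compare the index against a freshly generated period range that has the same start, length and frequency. This is a sketch under that assumption, not necessarily how is_uniform is implemented. The name sketch_is_uniform is hypothetical.

```python
import pandas as pd


def sketch_is_uniform(index: pd.PeriodIndex) -> bool:
    """Hypothetical check: is the index gap-free and increasing at index.freq?"""
    if len(index) == 0:
        return True
    # The evenly spaced index we would expect, starting at the first period.
    expected = pd.period_range(
        start=index[0], periods=len(index), freq=index.freq
    )
    return bool(index.equals(expected))


even = pd.PeriodIndex(["2021-01-01", "2021-01-02", "2021-01-03"], freq="D")
gap = pd.PeriodIndex(["2021-01-01", "2021-01-03"], freq="D")

print(sketch_is_uniform(even))
print(sketch_is_uniform(gap))
```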
gluonts.dataset.pandas.prepare_prediction_data(dataentry: Dict[str, Any], ignore_last_n_targets: int) → Dict[str, Any]

Remove the last ignore_last_n_targets values from target and past_feat_dynamic_real. Works in both the univariate and the multivariate case.

>>> prepare_prediction_data(
...    {"target": np.array([1., 2., 3., 4.])}, ignore_last_n_targets=2
... )
{'target': array([1., 2.])}
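Slicing along the last axis is one way to handle the univariate and multivariate cases with the same code. The sketch below is a NumPy-only illustration, assuming multivariate targets have shape (dimensions, time). The helper sketch_drop_last is hypothetical and is not the library function. As a choice made for this sketch, n = 0 leaves the entry unchanged.

```python
from typing import Any, Dict

import numpy as np


def sketch_drop_last(entry: Dict[str, Any], n: int) -> Dict[str, Any]:
    """Hypothetical helper: drop the last n time steps from selected fields."""
    trimmed = dict(entry)  # shallow copy so the input dict is not modified
    if n > 0:
        for name in ("target", "past_feat_dynamic_real"):
            if name in trimmed:
                # "..." keeps any leading axes, so 1-D and 2-D arrays both work.
                trimmed[name] = np.asarray(trimmed[name])[..., :-n]
    return trimmed


uni = sketch_drop_last({"target": np.array([1.0, 2.0, 3.0, 4.0])}, n=2)
multi = sketch_drop_last(
    {"target": np.array([[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]])}, n=2
)

print(uni["target"].shape, multi["target"].shape)
```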
gluonts.dataset.pandas.series_to_dataframe(series: Union[pandas.core.series.Series, List[pandas.core.series.Series], Dict[str, pandas.core.series.Series]]) → Union[pandas.core.frame.DataFrame, List[pandas.core.frame.DataFrame], Dict[str, pandas.core.frame.DataFrame]]