gluonts.dataset.pandas module
- class gluonts.dataset.pandas.PandasDataset(dataframes: typing.Union[pandas.core.frame.DataFrame, pandas.core.series.Series, typing.List[pandas.core.frame.DataFrame], typing.List[pandas.core.series.Series], typing.Dict[str, pandas.core.frame.DataFrame], typing.Dict[str, pandas.core.series.Series]], target: typing.Union[str, typing.List[str]] = 'target', timestamp: typing.Optional[str] = None, freq: typing.Optional[str] = None, feat_dynamic_real: typing.List[str] = <factory>, feat_dynamic_cat: typing.List[str] = <factory>, feat_static_real: typing.List[str] = <factory>, feat_static_cat: typing.List[str] = <factory>, past_feat_dynamic_real: typing.List[str] = <factory>, ignore_last_n_targets: int = 0)
Bases: object
A pandas.DataFrame-based dataset type.
This class is constructed from a collection of pandas.DataFrame objects, where each DataFrame represents one time series. A target column and a timestamp column are essential. Furthermore, static/dynamic real/categorical features can be specified.
- Parameters
dataframes (Union[pandas.core.frame.DataFrame, pandas.core.series.Series, List[pandas.core.frame.DataFrame], List[pandas.core.series.Series], Dict[str, pandas.core.frame.DataFrame], Dict[str, pandas.core.series.Series]]) – Single pd.DataFrame/pd.Series or a collection as list or dict containing at least timestamp and target values. If a Dict is provided, the key will be the associated item_id.
target (Union[str, List[str]]) – Name of the column that contains the target time series. For multivariate targets, a list of column names should be provided.
timestamp (Optional[str]) – Name of the column that contains the timestamp information.
freq (Optional[str]) – Frequency of observations in the time series. Must be a valid pandas frequency.
feat_dynamic_real (List[str]) – List of column names that contain dynamic real features.
feat_dynamic_cat (List[str]) – List of column names that contain dynamic categorical features.
feat_static_real (List[str]) – List of column names that contain static real features.
feat_static_cat (List[str]) – List of column names that contain static categorical features.
past_feat_dynamic_real (List[str]) – List of column names that contain dynamic real features only for the history.
ignore_last_n_targets (int) – For the target and past dynamic features, the last ignore_last_n_targets elements are removed when iterating over the data set. This becomes important when the predictor is called.
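As a rough illustration of the input layout described above, the sketch below builds a dict of per-series DataFrames using only pandas. The item names (item_A, item_B), the price column, and all values are made up for the example. The dict is the kind of object the dataframes parameter is documented to accept, with the keys serving as item_id.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: two daily series of different lengths.
index_a = pd.date_range("2021-01-01", periods=5, freq="D")
index_b = pd.date_range("2021-01-01", periods=3, freq="D")

dataframes = {
    "item_A": pd.DataFrame(
        {"target": np.arange(5, dtype=float), "price": np.ones(5)},
        index=index_a,
    ),
    "item_B": pd.DataFrame(
        {"target": np.arange(3, dtype=float), "price": np.ones(3)},
        index=index_b,
    ),
}

# Each frame holds one series: time in the index, the target in one column,
# and an extra column that could be named in feat_dynamic_real.
for item_id, df in dataframes.items():
    print(item_id, len(df), list(df.columns))
```

Under these assumptions, such a dict would be passed as dataframes, with target="target", feat_dynamic_real=["price"], and a freq such as "D".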
- dataframes: Union[pandas.core.frame.DataFrame, pandas.core.series.Series, List[pandas.core.frame.DataFrame], List[pandas.core.series.Series], Dict[str, pandas.core.frame.DataFrame], Dict[str, pandas.core.series.Series]]
- feat_dynamic_cat: List[str]
- feat_dynamic_real: List[str]
- feat_static_cat: List[str]
- feat_static_real: List[str]
- freq: Optional[str] = None
- classmethod from_long_dataframe(dataframe: pandas.core.frame.DataFrame, item_id: str, **kwargs) → gluonts.dataset.pandas.PandasDataset
Construct a PandasDataset out of a long dataframe. A long dataframe uses the long format for each variable. Target time series values, for example, are stacked on top of each other rather than side-by-side. The same is true for other dynamic or categorical features.
- Parameters
dataframe – pandas.DataFrame containing at least timestamp, target and item_id columns.
item_id – Name of the column that, when grouped by, gives the different time series.
**kwargs – Additional arguments. Same as for the PandasDataset class.
- Returns
GluonTS dataset based on pandas.DataFrame objects.
- Return type
gluonts.dataset.pandas.PandasDataset
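To make the long format concrete, here is a small pandas-only sketch of a long dataframe and of the grouping that the item_id description refers to. The column names and values are made up, and whether from_long_dataframe performs exactly this grouping internally is an assumption. The sketch only shows how one stacked frame splits into one frame per series.

```python
import pandas as pd

# Hypothetical long-format frame: rows of both series are stacked,
# and the "item_id" column says which series each row belongs to.
long_df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-01", "2021-01-02"]
        ),
        "target": [1.0, 2.0, 3.0, 10.0, 20.0],
        "item_id": ["A", "A", "A", "B", "B"],
    }
)

# Grouping by the item_id column yields one DataFrame per time series.
per_item = {key: group for key, group in long_df.groupby("item_id")}

for key, group in per_item.items():
    print(key, group["target"].tolist())
```

The resulting dict has the same shape as the Dict[str, pandas.DataFrame] input that the class constructor is documented to accept.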
- ignore_last_n_targets: int = 0
- past_feat_dynamic_real: List[str]
- target: Union[str, List[str]] = 'target'
- timestamp: Optional[str] = None
- gluonts.dataset.pandas.as_dataentry(data: pandas.core.frame.DataFrame, target: Union[str, List[str]], timestamp: Optional[str] = None, feat_dynamic_real: List[str] = [], feat_dynamic_cat: List[str] = [], feat_static_real: List[str] = [], feat_static_cat: List[str] = [], past_feat_dynamic_real: List[str] = []) → Dict[str, Any]
Convert a single time series (uni- or multi-variate) that is given in a pandas.DataFrame format to a DataEntry.
- Parameters
data – pandas.DataFrame containing at least timestamp, target and item_id columns.
target – Name of the column that contains the target time series. For multivariate targets, target expects a list of column names.
timestamp – Name of the column that contains the timestamp information. If None, the index of data is assumed to be the time.
feat_dynamic_real – List of column names that contain dynamic real features.
feat_dynamic_cat – List of column names that contain dynamic categorical features.
feat_static_real – List of column names that contain static real features.
feat_static_cat – List of column names that contain static categorical features.
past_feat_dynamic_real – List of column names that contain dynamic real features only for the history.
- Returns
A dictionary with at least target and start fields.
- Return type
DataEntry
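The shape of the returned dictionary can be pictured with a small stand-in. The helper to_entry_sketch below is hypothetical and is not the library function. It assumes a univariate target and a frame whose index already is a pandas PeriodIndex, and it produces only the two fields named above.

```python
from typing import Any, Dict

import pandas as pd


def to_entry_sketch(data: pd.DataFrame, target: str) -> Dict[str, Any]:
    """Hypothetical stand-in: build a dict with start and target fields."""
    return {
        # First period of the index marks where the series begins.
        "start": data.index[0],
        # Target values as a plain NumPy array.
        "target": data[target].to_numpy(),
    }


df = pd.DataFrame(
    {"target": [1.0, 2.0, 3.0]},
    index=pd.period_range("2021-01-01", periods=3, freq="D"),
)
entry = to_entry_sketch(df, target="target")
print(entry["start"], entry["target"])
```

The real function additionally handles the timestamp column and the feature arguments listed in its parameters.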
- gluonts.dataset.pandas.is_series(series: Any) → bool
Return True if series is pd.Series or a collection of pd.Series.
- gluonts.dataset.pandas.is_uniform(index: pandas.core.indexes.period.PeriodIndex) → bool
Check if index contains monotonically increasing periods, evenly spaced with frequency index.freq.
>>> ts = ["2021-01-01 00:00", "2021-01-01 02:00", "2021-01-01 04:00"]
>>> is_uniform(pd.DatetimeIndex(ts).to_period("2H"))
True
>>> ts = ["2021-01-01 00:00", "2021-01-01 04:00"]
>>> is_uniform(pd.DatetimeIndex(ts).to_period("2H"))
False
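One way such a check could work is to rebuild the index that would be expected if every step were exactly one index.freq, and compare it with the actual index. The function is_uniform_sketch below is a hypothetical re-implementation for illustration. It is not taken from the library source and uses a two-day frequency instead of the two-hour one above.

```python
import pandas as pd


def is_uniform_sketch(index: pd.PeriodIndex) -> bool:
    """Hypothetical check: are the periods evenly spaced by index.freq?"""
    # Index we would expect if each step advanced by exactly one freq.
    expected = pd.period_range(index[0], periods=len(index), freq=index.freq)
    return bool((expected == index).all())


regular = pd.DatetimeIndex(["2021-01-01", "2021-01-03", "2021-01-05"]).to_period("2D")
gappy = pd.DatetimeIndex(["2021-01-01", "2021-01-05"]).to_period("2D")

print(is_uniform_sketch(regular))
print(is_uniform_sketch(gappy))
```

The second index skips one two-day step, so the element-wise comparison fails at its second position.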
- gluonts.dataset.pandas.prepare_prediction_data(dataentry: Dict[str, Any], ignore_last_n_targets: int) → Dict[str, Any]
Remove the last ignore_last_n_targets values from target and past_feat_dynamic_real. Works in the univariate and multivariate case.
>>> prepare_prediction_data(
...     {"target": np.array([1., 2., 3., 4.])}, ignore_last_n_targets=2
... )
{'target': array([1., 2.])}
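The multivariate case can be sketched with NumPy alone. The helper drop_last_sketch below is hypothetical and is not the library function. It assumes that time runs along the last axis of each array, and it treats n = 0 as "remove nothing", which is a choice made for this sketch.

```python
from copy import deepcopy
from typing import Any, Dict

import numpy as np


def drop_last_sketch(entry: Dict[str, Any], n: int) -> Dict[str, Any]:
    """Hypothetical helper: drop the last n time steps from selected fields."""
    out = deepcopy(entry)  # leave the caller's entry untouched
    if n > 0:
        for name in ("target", "past_feat_dynamic_real"):
            if name in out:
                # The ellipsis keeps leading dimensions, so only the
                # last (time) axis is shortened.
                out[name] = out[name][..., :-n]
    return out


multivariate = {"target": np.array([[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]])}
trimmed = drop_last_sketch(multivariate, n=2)
print(trimmed["target"].shape)
```

Guarding n > 0 matters in this sketch because slicing with :-0 would return an empty array rather than the full one.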