gluonts.dataset.split.splitter module
Train/test splitter
This module defines strategies to split a whole dataset into train and test subsets.
For uniform datasets, where all time series start and end at the same point in time, OffsetSplitter can be used:
splitter = OffsetSplitter(prediction_length=24, split_offset=24)
train, test = splitter.split(whole_dataset)
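To make the idea of an offset-based split concrete, here is a minimal sketch in plain Python. It does not use gluonts; `offset_split` is a hypothetical helper, and it assumes that the test part keeps the training history and extends it by `prediction_length` values.

```python
from typing import List, Sequence, Tuple


def offset_split(
    target: Sequence[float], split_offset: int, prediction_length: int
) -> Tuple[List[float], List[float]]:
    """Hypothetical helper, not part of gluonts."""
    # Training part: the first `split_offset` observations.
    train = list(target[:split_offset])
    # Test part (assumption): the training history plus the next
    # `prediction_length` observations.
    test = list(target[: split_offset + prediction_length])
    return train, test


series = list(range(60))
train, test = offset_split(series, split_offset=24, prediction_length=24)
print(len(train), len(test))
```

Because the same integer offset is applied to every series, this only gives a consistent split when all series are aligned in time.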
For all other datasets, the more flexible DateSplitter can be used:
splitter = DateSplitter(
    prediction_length=24,
    split_date=pd.Timestamp('2018-01-31', freq='D')
)
train, test = splitter.split(whole_dataset)
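The date-based split can be illustrated with pandas alone. The sketch below is not the gluonts implementation; the variable names are illustrative, and it assumes that the training part ends at `split_date` (inclusive) and that the test part extends it by `prediction_length` daily steps.

```python
import pandas as pd

# A daily series of 60 observations starting on 2018-01-01.
index = pd.date_range("2018-01-01", periods=60, freq="D")
target = pd.Series(range(60), index=index)

split_date = pd.Timestamp("2018-01-31")
prediction_length = 24

# Label-based slicing on a DatetimeIndex includes the end label.
train = target.loc[:split_date]

# Assumption: the test series keeps the history and adds
# `prediction_length` further days.
test_end = split_date + pd.Timedelta(days=prediction_length)
test = target.loc[:test_end]

print(len(train), len(test))
```

Splitting on a timestamp rather than a position means series with different start dates are all cut at the same calendar point.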
The module also supports rolling splits:
splitter = DateSplitter(
    prediction_length=24,
    split_date=pd.Timestamp('2018-01-31', freq='D')
)
train, test = splitter.rolling_split(whole_dataset, windows=7)
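A rolling split produces several test windows per series instead of one. The following plain-Python sketch shows one common way to build such windows; it is an assumption about the general technique, not a description of what `rolling_split` does internally. `rolling_test_windows` is a hypothetical helper in which each successive window ends `prediction_length` steps later than the previous one.

```python
from typing import List, Sequence


def rolling_test_windows(
    target: Sequence[float],
    split_offset: int,
    prediction_length: int,
    windows: int,
) -> List[List[float]]:
    """Hypothetical helper, not part of gluonts."""
    tests = []
    for window in range(windows):
        # Assumption: window k ends (k + 1) * prediction_length steps
        # after the split point and keeps the full history before it.
        end = split_offset + prediction_length * (window + 1)
        tests.append(list(target[:end]))
    return tests


series = list(range(24 + 24 * 7))
tests = rolling_test_windows(
    series, split_offset=24, prediction_length=24, windows=7
)
print([len(t) for t in tests])
```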
class gluonts.dataset.split.splitter.AbstractBaseSplitter
Bases: abc.ABC
Base class for all other splitters.
- Parameters
prediction_length – The prediction length that is used to train the model.
max_history – If given, all entries in the test-set have a max-length of max_history. This can be used to produce smaller file-sizes.
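The effect of max_history can be sketched as keeping only the most recent observations of a series. The helper below is hypothetical and written in plain Python; it assumes that trimming is applied from the end of the series and that max_history, when given, is a positive integer.

```python
from typing import List, Optional, Sequence


def trim_history(
    target: Sequence[float], max_history: Optional[int] = None
) -> List[float]:
    """Hypothetical helper, not part of gluonts."""
    if max_history is None:
        # No limit: keep the whole series.
        return list(target)
    if max_history <= 0:
        raise ValueError("max_history must be positive")
    # Keep only the last `max_history` observations.
    return list(target[-max_history:])


full = list(range(10))
print(trim_history(full, max_history=3))
print(len(trim_history(full)))
```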
class gluonts.dataset.split.splitter.DateSplitter
Bases: gluonts.dataset.split.splitter.AbstractBaseSplitter, pydantic.main.BaseModel
max_history: Optional[int] = None
prediction_length: int = None
split_date: pd.Timestamp = None
class
gluonts.dataset.split.splitter.
OffsetSplitter
[source]¶ Bases:
pydantic.main.BaseModel
,gluonts.dataset.split.splitter.AbstractBaseSplitter
Requires uniform data.
max_history: Optional[int] = None
prediction_length: int = None
split_offset: int = None
class
gluonts.dataset.split.splitter.
TimeSeriesSlice
[source]¶ Bases:
pydantic.main.BaseModel
Like DataEntry, but all time-related fields are of type pd.Series, and the slice is indexable by time, e.g. ts_slice['2018':].
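The kind of time-based indexing mentioned above can be seen on an ordinary pandas Series with a DatetimeIndex. This sketch uses only pandas, not TimeSeriesSlice itself, and the variable names are illustrative.

```python
import pandas as pd

# Five daily observations that straddle the turn of the year.
index = pd.date_range("2017-12-30", periods=5, freq="D")
target = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=index)

# Partial-string slicing: everything from the start of 2018 onwards.
from_2018 = target.loc["2018":]

print(from_2018.index[0], list(from_2018))
```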
property end
feat_dynamic_cat: List[pd.Series] = None
feat_dynamic_real: List[pd.Series] = None
feat_static_cat: List[int] = None
feat_static_real: List[float] = None
classmethod from_data_entry(item: Dict[str, Any], freq: Optional[str] = None) → gluonts.dataset.split.splitter.TimeSeriesSlice
item: str = None
property start
target: pd.Series = None
-
property