gluonts.dataset.common module

class gluonts.dataset.common.BasicFeatureInfo[source]

Bases: pydantic.main.BaseModel

name: str = None
class gluonts.dataset.common.CategoricalFeatureInfo[source]

Bases: pydantic.main.BaseModel

cardinality: str = None
name: str = None
class gluonts.dataset.common.Channel[source]

Bases: pydantic.main.BaseModel

get_datasets() → gluonts.dataset.common.TrainDatasets[source]
metadata: Path = None
test: Optional[Path] = None
train: Path = None
class gluonts.dataset.common.FileDataset(path: pathlib.Path, freq: str, one_dim_target: bool = True, cache: bool = False)[source]

Bases: typing.Iterable

Dataset that loads JSON Lines files contained in a path.

Parameters
  • path – Path containing the dataset files. Every file in the path is considered part of the dataset, with the exception of files starting with ‘.’ or ending with ‘_SUCCESS’. A valid line in a file can be, for instance: {“start”: “2014-09-07”, “target”: [0.1, 0.2]}.

  • freq – Frequency of the observation in the time series. Must be a valid Pandas frequency.

  • one_dim_target – Whether to accept only univariate target time series.

  • cache – Indicates whether the dataset should be cached or not.

files() → List[pathlib.Path][source]

List the files that compose the dataset.

Returns

List of the paths of all files composing the dataset.

Return type

List[Path]

classmethod is_valid(path: pathlib.Path) → bool[source]
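The JSON Lines format that FileDataset reads can be sketched against a throwaway directory; the file name `data.json` and the series values are illustrative only, and the `FileDataset` call itself is shown commented because this sketch only exercises the on-disk format:

```python
import json
import tempfile
from pathlib import Path

# Two time series in JSON Lines form: one entry per line, each a JSON
# object with at least "start" and "target".
entries = [
    {"start": "2014-09-07", "target": [0.1, 0.2, 0.3]},
    {"start": "2014-09-08", "target": [1.0, 1.1]},
]

with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "data.json"
    with data_file.open("w") as fp:
        for entry in entries:
            fp.write(json.dumps(entry) + "\n")

    # dataset = FileDataset(path=Path(tmp), freq="D")  # would iterate the lines
    parsed = [json.loads(line) for line in data_file.read_text().splitlines()]

print(len(parsed), parsed[0]["start"])  # 2 2014-09-07
```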
class gluonts.dataset.common.ListDataset(data_iter: Iterable[Dict[str, Any]], freq: str, one_dim_target: bool = True)[source]

Bases: typing.Iterable

Dataset backed directly by a list of dictionaries.

data_iter

Iterable object yielding all items in the dataset. Each item should be a dictionary mapping strings to values. For instance: {“start”: “2014-09-07”, “target”: [0.1, 0.2]}.

freq

Frequency of the observation in the time series. Must be a valid Pandas frequency.

one_dim_target

Whether to accept only univariate target time series.
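The in-memory entries ListDataset expects have the same shape as the JSON Lines entries above. In this sketch the series values are made up, the gluonts call is shown commented, and pandas is used only to preview how each entry's start pairs with the single dataset-wide frequency:

```python
import pandas as pd

# Entries in the shape ListDataset expects.
data_iter = [
    {"start": "2014-09-07", "target": [0.1, 0.2, 0.3]},
    {"start": "2015-01-01 03:00:00", "target": [1.0, 1.1]},
]

# from gluonts.dataset.common import ListDataset
# dataset = ListDataset(data_iter, freq="D")

# Every entry shares one frequency; the resulting start timestamps can be
# previewed with pandas:
starts = [pd.Period(e["start"], freq="D").to_timestamp() for e in data_iter]
print(starts)
```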

class gluonts.dataset.common.MetaData[source]

Bases: pydantic.main.BaseModel

class Config[source]

Bases: pydantic.main.BaseConfig

allow_population_by_field_name = True
feat_dynamic_cat: List[CategoricalFeatureInfo] = None
feat_dynamic_real: List[BasicFeatureInfo] = None
feat_static_cat: List[CategoricalFeatureInfo] = None
feat_static_real: List[BasicFeatureInfo] = None
freq: str = None
prediction_length: Optional[int] = None
target: Optional[BasicFeatureInfo] = None
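A metadata document shaped like the fields above can be written as a plain dict; the feature names ("store_id", "price") are hypothetical, and note that cardinality is declared as a string, matching CategoricalFeatureInfo:

```python
import json

# A dict mirroring the MetaData fields listed above. The names "store_id"
# and "price" are made up for illustration; "cardinality" is a string,
# as declared on CategoricalFeatureInfo.
metadata = {
    "freq": "D",
    "prediction_length": 14,
    "feat_static_cat": [{"name": "store_id", "cardinality": "10"}],
    "feat_dynamic_real": [{"name": "price"}],
}

# MetaData(**metadata) would validate this dict if gluonts is installed.
print(json.dumps(metadata, sort_keys=True))
```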
class gluonts.dataset.common.ProcessDataEntry(freq: str, one_dim_target: bool = True)[source]

Bases: object

class gluonts.dataset.common.ProcessStartField[source]

Bases: pydantic.main.BaseModel

Transform the start field into a Timestamp with the given frequency.

Parameters
  • name – Name of the field to transform.

  • freq – Frequency to use. This must be a valid Pandas frequency string.

class Config[source]

Bases: object

arbitrary_types_allowed = True
freq: Union[str, pd.DateOffset] = None
name: str = None
process[source]

Create timestamp and align it according to frequency.

tz_strategy: TimeZoneStrategy = None
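The alignment step ProcessStartField describes can be sketched with pandas; this helper is not the actual implementation, only an illustration of snapping a parsed value to the start of its enclosing period:

```python
import pandas as pd

def align_start(value: str, freq: str) -> pd.Timestamp:
    # Sketch of the alignment step: parse the value, then snap it to the
    # start of the enclosing period for the given frequency.
    return pd.Period(value, freq=freq).to_timestamp()

print(align_start("2014-09-07 11:32", "D"))   # midnight of that day
print(align_start("2014-09-07 11:32", "M"))   # first day of that month
```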
class gluonts.dataset.common.ProcessTimeSeriesField(name, is_required: bool, is_static: bool, is_cat: bool)[source]

Bases: object

Converts a time series field identified by name from a list of numbers into a numpy array.

Constructor parameters modify the conversion logic in the following way:

If is_required=True, throws a GluonTSDataError if the field is not present in the Data dictionary.

If is_cat=True, the array type is np.int32, otherwise it is np.float32.

If is_static=True, asserts that the resulting array is 1D, otherwise asserts that the resulting array is 2D. 1D dynamic arrays of shape (T,) are automatically expanded to shape (1, T).

Parameters
  • name – Name of the field to process.

  • is_required – Whether the field must be present.

  • is_cat – Whether the field refers to categorical (i.e. integer) values.

  • is_static – Whether the field is static, i.e. has no time dimension.
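The dtype and shape rules above can be sketched in a few lines; this helper is an illustration of the conversion logic, not the actual implementation:

```python
import numpy as np

def process_field(value, is_cat: bool, is_static: bool) -> np.ndarray:
    # Sketch of the conversion rules: categorical fields become int32,
    # real-valued fields float32, and 1D dynamic fields gain a leading
    # feature dimension, (T,) -> (1, T).
    arr = np.asarray(value, dtype=np.int32 if is_cat else np.float32)
    if not is_static and arr.ndim == 1:
        arr = arr.reshape((1, -1))
    return arr

print(process_field([1, 2, 3], is_cat=True, is_static=False).shape)  # (1, 3)
```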

class gluonts.dataset.common.SourceContext(source, row)[source]

Bases: tuple

property row

Alias for field number 1

property source

Alias for field number 0

class gluonts.dataset.common.TimeSeriesItem[source]

Bases: pydantic.main.BaseModel

class Config[source]

Bases: object

arbitrary_types_allowed = True
json_encoders = {<class 'numpy.ndarray'>: <method 'tolist' of 'numpy.ndarray' objects>}
feat_dynamic_cat: List[List[int]] = None
feat_dynamic_real: List[List[float]] = None
feat_static_cat: List[int] = None
feat_static_real: List[float] = None
gluontsify(metadata: gluonts.dataset.common.MetaData) → dict[source]
item: Optional[str] = None
metadata: dict = None
start: Timestamp = None
target: np.ndarray = None
classmethod validate_target(v)[source]
class gluonts.dataset.common.TimeZoneStrategy[source]

Bases: enum.Enum

An enumeration.

error = 'error'
ignore = 'ignore'
utc = 'utc'
class gluonts.dataset.common.Timestamp[source]

Bases: pandas._libs.tslibs.timestamps.Timestamp

class gluonts.dataset.common.TrainDatasets[source]

Bases: tuple

A dataset containing two subsets, one to be used for training purposes, and the other for testing purposes, as well as metadata.

property metadata

Alias for field number 0

property test

Alias for field number 2

property train

Alias for field number 1

gluonts.dataset.common.load_datasets(metadata: pathlib.Path, train: pathlib.Path, test: Optional[pathlib.Path]) → gluonts.dataset.common.TrainDatasets[source]

Loads a dataset given the metadata, train and test paths.

Parameters
  • metadata – Path to the metadata file

  • train – Path to the training dataset files.

  • test – Path to the test dataset files.

Returns

An object collecting metadata, training data, test data.

Return type

TrainDatasets
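An on-disk layout that the three paths could point at can be sketched with the stdlib; the file names here (metadata.json, train/, test/) are assumptions for illustration, not a contract of the API, and the load_datasets call is shown commented:

```python
import json
import tempfile
from pathlib import Path

# Sketch of a plausible layout: a metadata file next to train/ and test/
# folders of JSON Lines files. File names are assumptions.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "train").mkdir()
    (root / "test").mkdir()
    (root / "metadata.json").write_text(json.dumps({"freq": "D"}))
    (root / "train" / "data.json").write_text(
        json.dumps({"start": "2014-09-07", "target": [0.1, 0.2]}) + "\n"
    )

    # datasets = load_datasets(metadata=root, train=root / "train",
    #                          test=root / "test")
    layout = sorted(p.name for p in root.iterdir())

print(layout)  # ['metadata.json', 'test', 'train']
```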

gluonts.dataset.common.save_datasets(dataset: gluonts.dataset.common.TrainDatasets, path_str: str, overwrite=True) → None[source]

Saves a TrainDatasets object as JSON Lines files in the given folder.

Parameters
  • dataset – The training datasets.

  • path_str – Where to save the dataset.

  • overwrite – Whether to delete any previous version in this folder.

gluonts.dataset.common.serialize_data_entry(data)[source]

Encode the numpy values in a DataEntry dictionary into lists so the dictionary can be JSON serialized.

Parameters

data – The dictionary to be transformed.

Returns

The transformed dictionary, in which all numpy values have been converted to lists.

Return type

Dict
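The effect of this transformation can be sketched with a dict comprehension; this helper illustrates the numpy-to-list conversion and is not the actual implementation:

```python
import json
import numpy as np

def serialize_entry(data: dict) -> dict:
    # Sketch of the transformation: numpy arrays become plain Python
    # lists so the entry survives json.dumps.
    return {
        key: value.tolist() if isinstance(value, np.ndarray) else value
        for key, value in data.items()
    }

entry = {"start": "2014-09-07", "target": np.array([0.1, 0.2])}
print(json.dumps(serialize_entry(entry)))
```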