gluonts.dataset.util module

class gluonts.dataset.util.DataLoadingBounds(lower, upper)[source]

Bases: tuple

property lower

Alias for field number 0

property upper

Alias for field number 1

class gluonts.dataset.util.MPWorkerInfo[source]

Bases: object

Contains the current worker information.

num_workers = 1

classmethod set_worker_info(num_workers: int, worker_id: int, worker_process: bool)[source]

worker_id = 0

worker_process = False

gluonts.dataset.util.batcher(iterable: Iterable[T], batch_size: int) → Iterator[List[T]][source]

Groups elements from iterable into batches of size batch_size.

>>> list(batcher("ABCDEFG", 3))
[['A', 'B', 'C'], ['D', 'E', 'F'], ['G']]

Unlike the grouper recipe in the itertools documentation, batcher does not pad the final batch with fill values.
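
The batching behaviour described above can be sketched with itertools.islice. This is a minimal illustration, not the library's implementation; the name batcher_sketch is hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batcher_sketch(iterable: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Hypothetical stand-in for batcher: yield lists of up to batch_size items."""
    it = iter(iterable)
    while True:
        # islice pulls at most batch_size items; the last batch may be shorter.
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


batches = list(batcher_sketch("ABCDEFG", 3))
print(batches)
```

Because the last batch is emitted as is, no fill value is needed for inputs whose length is not a multiple of batch_size.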

gluonts.dataset.util.cycle(it)[source]

Like itertools.cycle, but does not store the data.
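
One way to cycle without storing the data is to iterate over the source again on every pass. The sketch below assumes the input can be iterated repeatedly (a list or a dataset object, not a one-shot iterator); cycle_sketch is a hypothetical name and the library's implementation may differ.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def cycle_sketch(it: Iterable[T]) -> Iterator[T]:
    """Hypothetical stand-in for cycle: restart iteration instead of caching items."""
    while True:
        # Each pass calls iter(it) again, so nothing is kept in memory.
        yield from it


first_seven = list(islice(cycle_sketch([1, 2, 3]), 7))
print(first_seven)
```

itertools.cycle, by contrast, saves a copy of every item it has seen, which can be costly for large datasets.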

gluonts.dataset.util.dct_reduce(reduce_fn, dcts)[source]

Similar to reduce, but applies reduce_fn to fields of dicts with the same name.

>>> dct_reduce(sum, [{"a": 1}, {"a": 2}])
{'a': 3}
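
A minimal sketch consistent with the example above: the values stored under each key are collected across all dicts and passed to reduce_fn as a list. It assumes every dict has the keys of the first one; dct_reduce_sketch is a hypothetical name, not the library's code.

```python
from typing import Any, Callable, Dict, Iterable, List


def dct_reduce_sketch(
    reduce_fn: Callable[[List[Any]], Any], dcts: Iterable[Dict[str, Any]]
) -> Dict[str, Any]:
    """Hypothetical stand-in for dct_reduce."""
    dcts = list(dcts)
    if not dcts:
        return {}
    # Assumption: all dicts share the keys of the first dict.
    keys = dcts[0].keys()
    return {key: reduce_fn([dct[key] for dct in dcts]) for key in keys}


totals = dct_reduce_sketch(sum, [{"a": 1, "b": 10}, {"a": 2, "b": 20}])
largest = dct_reduce_sketch(max, [{"a": 1}, {"a": 2}])
print(totals, largest)
```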

gluonts.dataset.util.find_files(data_dir: pathlib.Path, predicate: Callable[[pathlib.Path], bool] = <function true_predicate>) → List[pathlib.Path][source]

gluonts.dataset.util.get_bounds_for_mp_data_loading(dataset_len: int) → gluonts.dataset.util.DataLoadingBounds[source]

Utility function that returns the bounds of the part of the dataset that the current worker should load.
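
The page does not spell out how the bounds are computed. The sketch below shows one plausible scheme, assumed here for illustration only: the dataset is cut into equal segments and the last worker also takes the remainder. Bounds and bounds_for_worker are hypothetical names, and the worker count and id are passed explicitly instead of being read from MPWorkerInfo.

```python
from typing import NamedTuple


class Bounds(NamedTuple):
    """Hypothetical analogue of DataLoadingBounds: a (lower, upper) named tuple."""

    lower: int
    upper: int


def bounds_for_worker(dataset_len: int, num_workers: int, worker_id: int) -> Bounds:
    # Assumed scheme: equal-sized segments, remainder goes to the last worker.
    segment = dataset_len // num_workers
    lower = worker_id * segment
    upper = dataset_len if worker_id == num_workers - 1 else lower + segment
    return Bounds(lower, upper)


parts = [bounds_for_worker(10, 3, worker_id) for worker_id in range(3)]
print(parts)
```

With the default values shown for MPWorkerInfo (a single worker with id 0), this scheme would cover the whole dataset.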

gluonts.dataset.util.shuffler(stream: Iterable[T], batch_size: int) → Iterator[T][source]

Modifies a stream by shuffling items in windows.

It continuously takes batch_size elements from the stream and yields the elements of each batch in random order.
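
Window shuffling as described can be sketched as follows. This is an illustration under the assumption that each window is shuffled independently with random.shuffle; shuffler_sketch is a hypothetical name.

```python
import random
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def shuffler_sketch(stream: Iterable[T], batch_size: int) -> Iterator[T]:
    """Hypothetical stand-in for shuffler: shuffle within consecutive windows."""
    it = iter(stream)
    while True:
        window = list(islice(it, batch_size))
        if not window:
            return
        # Items never leave their window, so the shuffle is only local.
        random.shuffle(window)
        yield from window


shuffled = list(shuffler_sketch(range(10), 4))
print(shuffled)
```

Only batch_size items are held in memory at a time, at the price of a shuffle that is local and not global: an item can move at most batch_size - 1 positions.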

gluonts.dataset.util.take(iterable: Iterable[T], n: int) → Iterator[T][source]

Returns up to n elements from iterable.

This is similar to xs[:n], except that it works on any Iterable and possibly consumes the given iterable.

>>> list(take(range(10), 5))
[0, 1, 2, 3, 4]
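
The note about consuming the input matters when the argument is an iterator and not a sequence. The sketch below uses itertools.islice to show the effect; take_sketch is a hypothetical stand-in and may not match the library's implementation.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def take_sketch(iterable: Iterable[T], n: int) -> Iterator[T]:
    """Hypothetical stand-in for take: lazily yield up to n items."""
    return islice(iterable, n)


numbers = iter(range(10))
head = list(take_sketch(numbers, 5))
# The iterator has advanced: only the remaining items are left.
rest = list(numbers)
print(head, rest)
```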

gluonts.dataset.util.to_pandas(instance: dict, freq: str = None) → pandas.core.series.Series[source]

Transform a dictionary into a pandas.Series object, using its “start” and “target” fields.

Parameters
  • instance – Dictionary containing the time series data.

  • freq – Frequency to use in the pandas.Series index.

Returns

Pandas time series object.

Return type

pandas.Series

gluonts.dataset.util.true_predicate(*args) → bool[source]