load_time_series_segmentation_benchmark

load_time_series_segmentation_benchmark(extract_path: PathLike | None = None, return_metadata: bool = False) → tuple[list[ndarray], list[ndarray]] | tuple[list[ndarray], list[ndarray], list[tuple[str, int]]]

Load the Time Series Segmentation Benchmark (TSSB).

This function loads the Time Series Segmentation Benchmark (TSSB) into memory, downloading it from GitHub (https://github.com/ermshaua/time-series-segmentation-benchmark) [1] if the data is not available at the specified extract_path. The benchmark contains 75 annotated time series (TS) with 1-9 segments each. Each TS is constructed from one of the UEA & UCR time series classification datasets: the TS of a dataset are grouped by label and concatenated to create segments with distinctive temporal patterns and statistical properties. The offsets at which segments change are annotated as change points (CPs). Additionally, resampling is applied to control the data resolution. Approximate, hand-selected window sizes that capture the temporal patterns are provided.
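The construction described above can be illustrated with a minimal sketch. This is not the code used to build TSSB: the helper name `concatenate_by_label`, the toy instances and the labels are all assumptions made for illustration, and the resampling step is omitted. It also assumes that a change point is the offset of the first value of the next segment.

```python
def concatenate_by_label(instances, labels):
    """Group instances by label, concatenate them, and record segment boundaries.

    Hypothetical helper for illustration only; not part of aeon or TSSB.
    """
    # Collect the values of all instances that share a label (insertion order kept).
    grouped = {}
    for series, label in zip(instances, labels):
        grouped.setdefault(label, []).extend(series)

    ts, change_points = [], []
    for values in grouped.values():
        if ts:
            # Assumption: a change point is the offset where the next segment starts.
            change_points.append(len(ts))
        ts.extend(values)
    return ts, change_points


# Toy classification-style data: four short instances with three distinct labels.
instances = [[0.0, 0.1], [1.0, 1.1, 0.9], [0.2, 0.0], [2.0]]
labels = ["a", "b", "a", "c"]

ts, change_points = concatenate_by_label(instances, labels)
print(len(ts), change_points)  # 8 [4, 7]
```

Three labels yield three segments and therefore two change points, which matches the idea that a series with k segments carries k - 1 annotations.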

If you do not specify extract_path, the path defaults to aeon/datasets/local_data. If the benchmark data is not present in extract_path, the function attempts to download it.

Parameters:

extract_path: str, default=None: The path to look for the data. If no path is provided, the function looks in aeon/datasets/local_data/. If a path is given, it can be an absolute (e.g. C:/Temp/) or relative (e.g. Temp/ or ./Temp/) path to an existing CSV file.
return_metadata: boolean, default=False: If True, returns a tuple (X, y, metadata); otherwise returns a tuple (X, y).

Returns:

X: list of np.ndarray: The list of univariate (1d) time series with variable shape (n_instances,).
y: list of np.ndarray: The list of change points for every time series.
metadata: list of tuple (str, int), optional: The list of tuples containing dataset names and window sizes. Only returned if return_metadata is True.
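A hedged sketch of how the returned values might be consumed. So that it runs without a download, it uses small stand-in lists shaped like the documented return value; the dataset names, window sizes and the helper `split_at_change_points` are hypothetical, and the code assumes that each change point is the offset at which the next segment begins.

```python
def split_at_change_points(ts, change_points):
    """Split one series into consecutive segments at the given offsets.

    Hypothetical helper; works on lists and on 1d NumPy arrays alike.
    """
    bounds = [0, *(int(cp) for cp in change_points), len(ts)]
    return [ts[start:end] for start, end in zip(bounds[:-1], bounds[1:])]


# Stand-in values shaped like (X, y, metadata); NOT real benchmark data.
X = [[0.1, 0.2, 0.1, 0.9, 1.0, 0.8, 0.9], [0.5, 0.4, 0.5, 0.6]]
y = [[3], []]  # the second series has a single segment, so no change points
metadata = [("ExampleA", 2), ("ExampleB", 2)]  # hypothetical (name, window size)

summary = []
for ts, cps, (name, window_size) in zip(X, y, metadata):
    segments = split_at_change_points(ts, cps)
    summary.append((name, window_size, [len(s) for s in segments]))

print(summary)  # [('ExampleA', 2, [3, 4]), ('ExampleB', 2, [4])]
```

With the real loader, the same loop would presumably apply to the result of calling the function with return_metadata=True.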

Raises:

URLError or HTTPError: If the GitHub repository is not accessible.
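Callers working offline may want to handle a failed download gracefully. The sketch below assumes the exceptions are the ones from Python's urllib.error module, where HTTPError is a subclass of URLError, so catching URLError covers both. The wrapper `load_or_none` and the stand-in `offline_loader` are hypothetical names; in real use the loader argument would be load_time_series_segmentation_benchmark.

```python
from urllib.error import URLError


def load_or_none(loader, **kwargs):
    """Call a dataset loader and return None if the download fails.

    Hypothetical wrapper for illustration; not part of aeon.
    """
    try:
        return loader(**kwargs)
    except URLError as err:  # also catches HTTPError, a subclass of URLError
        print(f"Could not download the benchmark: {err.reason}")
        return None


def offline_loader(**kwargs):
    """Stand-in loader that simulates an unreachable repository."""
    raise URLError("simulated network failure")


result = load_or_none(offline_loader, return_metadata=True)
print(result)  # None
```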

References

[1] Arik Ermshaus, Patrick Schäfer, Ulf Leser: ClaSP: parameter-free time series segmentation. Data Mining and Knowledge Discovery, 2023, DOI:10.1007/s10618-023-00923-x.

Examples

>>> from aeon.datasets import load_time_series_segmentation_benchmark
>>> X, y = load_time_series_segmentation_benchmark()