SeriesSearch

class SeriesSearch(k: int = 1, threshold: float = inf, distance: str = 'euclidean', distance_args: dict | None = None, inverse_distance: bool = False, normalise: bool = False, speed_up: str = 'fastest', n_jobs: int = 1)

Series search estimator.

The series search estimator returns a set of matches for each subsequence of size L in the time series given during predict, where L is the length argument of predict. Each subsequence is matched against all subsequences of size L in the time series given during fit, which form the search space.
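The matching described above can be pictured with a brute-force sketch. It is a plain-Python illustration for univariate series and the Euclidean distance, not the library's implementation; the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_series_search(database, series, length):
    """Find the closest database subsequence for every subsequence of `series`.

    database: list of univariate series (lists of floats), the search space.
    Returns one (distance, id_sample, id_timepoint) tuple per query subsequence.
    """
    results = []
    for q_start in range(len(series) - length + 1):
        query = series[q_start:q_start + length]
        best = (math.inf, -1, -1)
        for id_sample, candidate_series in enumerate(database):
            for t in range(len(candidate_series) - length + 1):
                d = euclidean(query, candidate_series[t:t + length])
                if d < best[0]:  # ties keep the first candidate found
                    best = (d, id_sample, t)
        results.append(best)
    return results


database = [[0.0, 1.0, 2.0, 3.0, 4.0], [5.0, 5.0, 1.0, 2.0]]
series = [1.0, 2.0, 3.0]
matches = brute_force_series_search(database, series, length=2)
```

Here `series` yields two queries of length 2, so `matches` holds two entries, one per query.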

The number of matches depends on the k and threshold parameters, which define what counts as a valid match during the search. If only k is used, at most k matches (the k best) are returned. If only threshold is used, with k set to np.inf, all candidates whose distance to the query is less than or equal to threshold are returned. If both are used, the k best matches among the candidates that satisfy the threshold are returned.
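The interplay of k and threshold can be sketched on a single distance profile. This is a minimal plain-Python illustration with a hypothetical helper name; it assumes the threshold comparison is inclusive in every case.

```python
import math


def select_matches(distances, k=1, threshold=math.inf):
    """Return indices of up to k candidates with distance <= threshold, best first."""
    valid = sorted((d, i) for i, d in enumerate(distances) if d <= threshold)
    if k != math.inf:
        valid = valid[:k]
    return [i for _, i in valid]


profile = [0.9, 0.1, 0.5, 0.3]

top_two = select_matches(profile, k=2)                             # k only
within = select_matches(profile, k=math.inf, threshold=0.5)        # threshold only
both = select_matches(profile, k=2, threshold=0.2)                 # k and threshold
```

With both conditions set, fewer than k matches can be returned when not enough candidates satisfy the threshold.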

Parameters:

k (int, default=1): The number of best matches to return during predict for each subsequence.
threshold (float, default=np.inf): The maximum distance to the query for a candidate to be considered a valid match during predict.
distance (str, default="euclidean"): Name of the distance function to use. A list of valid strings can be found in the documentation for aeon.distances.get_distance_function. If a callable is passed, it must be either a Python function or a numba function compiled with nopython=True that takes two 1D numpy arrays as input and returns a float.
distance_args (dict, default=None): Optional keyword arguments for the distance function.
normalise (bool, default=False): Whether the distance function should be z-normalised.
speed_up (str, default='fastest'): Which speed-up technique to use for the selected distance function. By default, the fastest algorithm is used. A list of available algorithms for each distance can be obtained by calling the get_speedup_function_names function.
inverse_distance (bool, default=False): If True, the matching is made on the inverse of the distance, so the worst matches to the query are returned instead of the best ones.
n_jobs (int, default=1): Number of parallel jobs to use.
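To illustrate what the normalise option implies, the sketch below compares a raw and a z-normalised Euclidean distance in plain Python. The helper names are hypothetical, and the handling of constant subsequences is an assumption rather than the library's documented behaviour.

```python
import math


def z_normalise(values, eps=1e-8):
    """Rescale a subsequence to zero mean and unit standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    # Assumption: a (near-)constant subsequence is only centred, to avoid dividing by zero.
    scale = std if std > eps else 1.0
    return [(v - mean) / scale for v in values]


def z_normalised_euclidean(a, b):
    """Euclidean distance between the z-normalised versions of a and b."""
    za, zb = z_normalise(a), z_normalise(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(za, zb)))


a = [1.0, 2.0, 3.0]
b = [10.0, 20.0, 30.0]  # same shape as a, different scale and offset

raw = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
normalised = z_normalised_euclidean(a, b)
```

The two subsequences are far apart in raw values, but their normalised distance is close to zero because they share the same shape.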

Attributes:

X_ (array, shape (n_cases, n_channels, n_timepoints)): The input time series stored during the fit method. This is the database we search in when given a query.
distance_profile_function (function): The function used to compute the distance profile. This is determined during the fit method based on the distance and normalise parameters.

Notes

Capabilities
Missing Values	No
Multithreading	Yes
Univariate	Yes
Multivariate	Yes
Unequal Length	Yes

For now, the multivariate case is only treated as independent. Distances are computed for each channel independently and then summed together.
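A minimal sketch of the independent multivariate strategy follows, in plain Python with hypothetical function names. It assumes the per-channel Euclidean distances themselves are summed; whether the library sums squared or rooted distances is not stated here.

```python
import math


def channel_distance(a, b):
    """Euclidean distance between two single-channel subsequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def independent_multivariate_distance(query, candidate):
    """Sum the per-channel distances; inputs are lists of channels."""
    return sum(channel_distance(q, c) for q, c in zip(query, candidate))


query = [[0.0, 0.0], [1.0, 1.0]]      # 2 channels, 2 time points
candidate = [[3.0, 4.0], [1.0, 1.0]]  # differs from the query on channel 0 only

total = independent_multivariate_distance(query, candidate)
```

Channel 0 contributes a distance of 5.0 and channel 1 contributes 0.0, so no interaction between channels is taken into account.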

Methods

`clone`([random_state])	Obtain a clone of the object with the same hyperparameters.
`fit`(X[, y])	Fit method: data preprocessing and storage.
`get_class_tag`(tag_name[, raise_error, ...])	Get tag value from estimator class (only class tags).
`get_class_tags`()	Get class tags from estimator class and all its parent classes.
`get_fitted_params`([deep])	Get fitted parameters.
`get_metadata_routing`()	Sklearn metadata routing.
`get_params`([deep])	Get parameters for this estimator.
`get_speedup_function_names`()	Get available speedup for series search in aeon.
`get_tag`(tag_name[, raise_error, ...])	Get tag value from estimator class.
`get_tags`()	Get tags from estimator.
`predict`(X, length[, axis, X_index, ...])	Predict method: check the shape of X and call _predict to perform the search.
`reset`([keep])	Reset the object to a clean post-init state.
`set_params`(**params)	Set the parameters of this estimator.
`set_tags`(**tag_dict)	Set dynamic tags to given values.

final predict(X: ndarray, length: int, axis: int = 1, X_index=None, exclusion_factor=2.0, apply_exclusion_to_result=False)

Predict method: check the shape of X and call _predict to perform the search.

If the distance profile function is normalised, it stores the means and standard deviations of X and X_, where X_ is the training data.

Parameters:

X (np.ndarray, 2D array of shape (n_channels, series_length)): Input time series used for the search.
length (int): The length parameter that will be used to extract queries from X.
axis (int, default=1): The time point axis of the input series if it is 2D. If axis==0, each column is assumed to be a time series and each row a time point, i.e. the shape of the data is (n_timepoints, n_channels). axis==1 indicates the time series are in rows, i.e. the shape of the data is (n_channels, n_timepoints).
X_index (int, default=None): Indicates whether X is part of the dataset that was given during the fit method. If so, this integer should be the sample id. The search will define an exclusion zone for the queries extracted from X in order to avoid matching them with themselves. If None, X is considered not to be extracted from X_.
exclusion_factor (float, default=2.): The factor to apply to the query length to define the exclusion zone. The exclusion zone is defined from id_timestamp - query_length//exclusion_factor to id_timestamp + query_length//exclusion_factor. This also applies to the matching conditions defined by child classes. For example, with TopKSimilaritySearch, the k best matches are also subject to the exclusion zone, but with id_timestamp the index of one of the k matches.
apply_exclusion_to_result (bool, default=False): Whether to apply the exclusion factor to the output of the similarity search. This means that two matches of the query from the same sample must be spaced by at least +/- query_length//exclusion_factor. This avoids pathological matching: for example, when extracting the best two matches, if the best match is located at id_timestamp, there is a high chance that the second best match is located at id_timestamp +/- 1, as the two share all their values except one.
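The effect of applying the exclusion zone to the result can be sketched on a single distance profile. This is a plain-Python illustration, not the library's implementation; the helper name is hypothetical, and treating both ends of the zone as inclusive is an assumption.

```python
import math


def top_k_with_exclusion(profile, k, query_length, exclusion_factor=2.0):
    """Pick up to k best indices, masking a zone around each pick."""
    half_width = int(query_length // exclusion_factor)
    masked = list(profile)
    picked = []
    while len(picked) < k and masked:
        best = min(range(len(masked)), key=masked.__getitem__)
        if masked[best] == math.inf:  # every remaining candidate is excluded
            break
        picked.append(best)
        # Assumption: the zone covers best - half_width .. best + half_width inclusive.
        start = max(0, best - half_width)
        end = min(len(masked), best + half_width + 1)
        for i in range(start, end):
            masked[i] = math.inf
    return picked


profile = [0.50, 0.10, 0.11, 0.12, 0.90, 0.20, 0.80]

# Without exclusion, the two best matches are neighbours.
naive = sorted(range(len(profile)), key=profile.__getitem__)[:2]

# With query_length=4 and the default factor, indices 0..3 are masked after the first pick.
spaced = top_k_with_exclusion(profile, k=2, query_length=4)
```

The naive selection returns two nearly identical overlapping matches, while the spaced selection returns a second match outside the zone of the first.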

Returns:

Tuple(ndarray, ndarray): The first array, of shape (series_length - length + 1, n_matches), contains the distances between all the queries of size length and their best matches in X_. The second array, of shape (series_length - length + 1, n_matches, 2), contains the indexes of these matches as (id_sample, id_timepoint). The corresponding match can be retrieved as X_[id_sample, :, id_timepoint : id_timepoint + length].
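The shapes and the retrieval rule above can be checked with a small sketch. It uses nested lists in place of numpy arrays so that it stays self-contained; the helper names are hypothetical.

```python
def n_queries(series_length, length):
    """Number of subsequences of size `length` in a series, i.e. rows of the output."""
    return series_length - length + 1


def retrieve_match(X_fit, id_sample, id_timepoint, length):
    """List equivalent of X_[id_sample, :, id_timepoint : id_timepoint + length]."""
    return [channel[id_timepoint:id_timepoint + length] for channel in X_fit[id_sample]]


# Database of 2 cases, 2 channels, 5 time points.
X_fit = [
    [[0, 1, 2, 3, 4], [10, 11, 12, 13, 14]],
    [[5, 6, 7, 8, 9], [15, 16, 17, 18, 19]],
]

rows = n_queries(series_length=10, length=3)
match = retrieve_match(X_fit, id_sample=1, id_timepoint=2, length=3)
```

The retrieved match keeps all channels of the selected case and spans `length` time points starting at id_timepoint.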

Raises:

TypeError: If the input X array is not 2D.
ValueError: If the length of the query is greater

classmethod get_speedup_function_names()

Get available speedup for series search in aeon.

The returned structure is a dictionary that contains the names of all available speedups for normalised and non-normalised distance functions.

Returns:

dict: The names of the available speedups that can be used as parameters in similarity search classes.

clone(random_state=None)

Obtain a clone of the object with the same hyperparameters.

A clone is a different object without shared references, in post-init state. This function is equivalent to returning sklearn.clone of self. Equal in value to type(self)(**self.get_params(deep=False)).
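The equivalence stated above can be illustrated with a toy estimator. `ToyEstimator` and `clone_like` are hypothetical names used only for this sketch; it ignores the random_state handling of the real method.

```python
class ToyEstimator:
    """Stand-in estimator with two hyperparameters and one fitted attribute."""

    def __init__(self, k=1, threshold=float("inf")):
        self.k = k
        self.threshold = threshold

    def get_params(self, deep=True):
        return {"k": self.k, "threshold": self.threshold}

    def fit(self, X):
        self.X_ = X  # fitted state, not a hyperparameter
        return self


def clone_like(estimator):
    """Build a new object from the hyperparameters only, as in the formula above."""
    return type(estimator)(**estimator.get_params(deep=False))


fitted = ToyEstimator(k=3).fit([[1.0, 2.0, 3.0]])
fresh = clone_like(fitted)
```

The clone carries the same hyperparameter values but none of the fitted state, which is what "post-init state" refers to.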

Parameters:

random_state (int, RandomState instance, or None, default=None): Sets the random state of the clone. If None, the random state is not set. If int, random_state is the seed used by the random number generator. If RandomState instance, random_state is the random number generator.

Returns:

estimator (object): Instance of type(self), clone of self (see above).

fit(X: ndarray, y=None)

Fit method: data preprocessing and storage.

Parameters:

X (np.ndarray, 3D array of shape (n_cases, n_channels, n_timepoints)): Input array to be used as the database for the similarity search.
y (optional): Not used.

Returns:

self

Raises:

TypeError: If the input X array is not 3D.

classmethod get_class_tag(tag_name, raise_error=True, tag_value_default=None)

Get tag value from estimator class (only class tags).

Parameters:

tag_name (str): Name of tag value.
raise_error (bool, default=True): Whether a ValueError is raised when the tag is not found.
tag_value_default (any type, default=None): Default/fallback value if the tag is not found and no error is raised.

Returns:

tag_value: Value of the tag_name tag in cls. If the tag is not found, an error is raised if raise_error is True; otherwise tag_value_default is returned.

Raises:

ValueError: if raise_error is True and tag_name is not in self.get_tags().keys()

Examples

>>> from aeon.classification import DummyClassifier
>>> DummyClassifier.get_class_tag("capability:multivariate")
True

classmethod get_class_tags()

Get class tags from estimator class and all its parent classes.

Returns:

collected_tags (dict): Dictionary of tag name and tag value pairs. Collected from the _tags class attribute via nested inheritance. These are not overridden by dynamic tags set by set_tags or class __init__ calls.

get_fitted_params(deep=True)

Get fitted parameters.

State required: Requires state to be "fitted".

Parameters:

deep (bool, default=True): If True, will return the fitted parameters for this estimator and contained subobjects that are estimators.

Returns:

fitted_params (dict): Fitted parameter names mapped to their values.

get_metadata_routing()

Sklearn metadata routing.

Not supported by aeon estimators.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True): If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params (dict): Parameter names mapped to their values.

get_tag(tag_name, raise_error=True, tag_value_default=None)

Get tag value from estimator class.

Includes dynamic and overridden tags.

Parameters:

tag_name (str): Name of tag to be retrieved.
raise_error (bool, default=True): Whether a ValueError is raised when the tag is not found.
tag_value_default (any type, default=None): Default/fallback value if the tag is not found and no error is raised.

Returns:

tag_value: Value of the tag_name tag in self. If the tag is not found, an error is raised if raise_error is True; otherwise tag_value_default is returned.

Raises:

ValueError: if raise_error is True and tag_name is not in self.get_tags().keys()

Examples

>>> from aeon.classification import DummyClassifier
>>> d = DummyClassifier()
>>> d.get_tag("capability:multivariate")
True

get_tags()

Get tags from estimator.

Includes dynamic and overridden tags.

Returns:

collected_tags (dict): Dictionary of tag name and tag value pairs. Collected from the _tags class attribute via nested inheritance, and then any overridden and new tags from __init__ or set_tags.
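The difference between class tags and dynamic tags can be sketched with toy classes. This is an illustration of the described behaviour only: the classes are hypothetical, and the `_tags_dynamic` attribute name is an assumption made for the sketch.

```python
class ToyBase:
    _tags = {"capability:multivariate": False, "capability:missing_values": False}

    def __init__(self):
        self._tags_dynamic = {}  # hypothetical storage for per-instance overrides

    @classmethod
    def get_class_tags(cls):
        """Collect _tags from the whole inheritance chain, children overriding parents."""
        collected = {}
        for parent in reversed(cls.__mro__):
            collected.update(vars(parent).get("_tags", {}))
        return collected

    def set_tags(self, **tag_dict):
        self._tags_dynamic.update(tag_dict)
        return self

    def get_tags(self):
        """Class tags first, then dynamic tags on top."""
        return {**self.get_class_tags(), **self._tags_dynamic}


class ToyChild(ToyBase):
    _tags = {"capability:multivariate": True}


est = ToyChild().set_tags(**{"capability:missing_values": True})
class_tags = ToyChild.get_class_tags()
instance_tags = est.get_tags()
```

The class-level view is unaffected by the set_tags call, while the instance-level view reflects the override.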

reset(keep=None)

Reset the object to a clean post-init state.

After a self.reset() call, self is equal or similar in value to type(self)(**self.get_params(deep=False)), assuming no other attributes were kept using keep.

Detailed behaviour:

Removes any object attributes, except hyperparameters (arguments of __init__) and object attributes containing double underscores, i.e. the string "__".

Runs __init__ with the current values of the hyperparameters (the result of get_params).

Not affected by the reset are: object attributes containing double underscores; class and object methods and class attributes; any attributes specified in the keep argument.
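The reset behaviour described above can be approximated on a toy object. `Toy` and `reset_like` are hypothetical names, and the sketch passes the hyperparameter names explicitly instead of calling get_params.

```python
class Toy:
    def __init__(self, k=1):
        self.k = k

    def fit(self, X):
        self.X_ = X
        self.n_seen_ = len(X)
        return self


def reset_like(obj, param_names, keep=None):
    """Drop non-hyperparameter attributes, then rerun __init__ with current values."""
    if keep is None:
        keep = []
    elif isinstance(keep, str):
        keep = [keep]
    params = {name: getattr(obj, name) for name in param_names}
    for attr in list(vars(obj)):
        # Hyperparameters, double-underscore attributes and kept attributes survive.
        if attr in params or "__" in attr or attr in keep:
            continue
        delattr(obj, attr)
    obj.__init__(**params)
    return obj


toy = Toy(k=2).fit([1.0, 2.0, 3.0])
reset_like(toy, param_names=["k"], keep="n_seen_")
```

After the call, the hyperparameter k and the kept attribute n_seen_ remain, while X_ has been removed.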

Parameters:

keep (None, str, or list of str, default=None): If None, all attributes are removed except hyperparameters. If str, only the attribute with this name is kept. If list of str, only the attributes with these names are kept.

Returns:

self (object): Reference to self.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
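The <component>__<parameter> convention can be sketched with toy objects. The classes and the `set_nested_params` helper are hypothetical and only mimic the routing of keys; the real method also validates parameter names.

```python
class Component:
    def __init__(self, k=1):
        self.k = k


class Wrapper:
    def __init__(self, searcher=None, verbose=False):
        self.searcher = searcher if searcher is not None else Component()
        self.verbose = verbose


def set_nested_params(obj, **params):
    """Route 'component__parameter' keys to the named component, others to obj."""
    for key, value in params.items():
        name, sep, sub_key = key.partition("__")
        if sep:
            # Recurse so that deeper nesting such as a__b__c is also handled.
            set_nested_params(getattr(obj, name), **{sub_key: value})
        else:
            setattr(obj, name, value)
    return obj


wrapper = set_nested_params(Wrapper(), searcher__k=5, verbose=True)
```

Here `searcher__k=5` updates the k parameter of the nested searcher, while `verbose=True` is set on the wrapper itself.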

Parameters:

**params (dict): Estimator parameters.

Returns:

self (estimator instance): Estimator instance.

set_tags(**tag_dict)

Set dynamic tags to given values.

Parameters:

**tag_dict (dict): Dictionary of tag name and tag value pairs.

Returns:

self (object): Reference to self.