QuerySearch
- class QuerySearch(k: int = 1, threshold: float = inf, distance: str = 'euclidean', distance_args: dict | None = None, inverse_distance: bool = False, normalise: bool = False, speed_up: str = 'fastest', n_jobs: int = 1, store_distance_profiles: bool = False)
Query search estimator.
The query search estimator returns a set of matches of a query in a search space, which is defined by the time series dataset given during fit. The k and threshold parameters determine what counts as a valid match during the search, and therefore how many matches are returned. If only k is used, at most k matches (the k best) are returned. If only threshold is used, with k set to np.inf, all candidates whose distance to the query is less than or equal to threshold are returned. If both are used, the k best matches whose distance to the query is below threshold are returned.
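The interplay between k and threshold can be sketched in plain Python. This is only an illustration of the selection rule described above, not aeon's implementation: `select_matches` is a hypothetical helper, the distance profile is made up, and the threshold comparison is assumed to be inclusive.

```python
import math


def select_matches(distances, k=1, threshold=math.inf):
    """Return (distance, index) pairs of the best valid candidates.

    Hypothetical helper for illustration; not part of aeon.
    """
    # Keep only candidates within the threshold (inclusive: an assumption).
    valid = [(d, i) for i, d in enumerate(distances) if d <= threshold]
    # Best matches first: sort by increasing distance, ties broken by index.
    valid.sort()
    # k = math.inf means the number of matches is not limited.
    return valid if k == math.inf else valid[:k]


profile = [0.9, 0.1, 0.4, 2.5, 0.3]  # made-up distance profile

top2 = select_matches(profile, k=2)
within = select_matches(profile, k=math.inf, threshold=0.5)
both = select_matches(profile, k=2, threshold=0.2)
```

With both conditions active, fewer than k matches can be returned: here only one candidate lies within the threshold of 0.2.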
- Parameters:
- k : int, default=1
The number of best matches to return during predict for a given query.
- threshold : float, default=np.inf
The maximum distance to the query for a candidate to be considered a valid match during predict. With the default np.inf, no candidate is rejected based on its distance.
- distance : str, default="euclidean"
Name of the distance function to use. A list of valid strings can be found in the documentation for aeon.distances.get_distance_function. If a callable is passed, it must be either a Python function or a numba function with nopython=True that takes two 1D numpy arrays as input and returns a float.
- distance_args : dict, default=None
Optional keyword arguments for the distance function.
- normalise : bool, default=False
Whether the distance function should be z-normalised.
- speed_up : str, default='fastest'
Which speed-up technique to use for the selected distance function. By default, the fastest algorithm is used. A list of the available algorithms for each distance can be obtained by calling the get_speedup_function_names function.
- inverse_distance : bool, default=False
If True, matching is performed on the inverse of the distance, so the worst matches to the query are returned instead of the best ones.
- n_jobs : int, default=1
Number of parallel jobs to use.
- store_distance_profiles : bool, default=False
Whether to store the computed distance profiles in the attribute "distance_profiles_" after calling the predict method. The raw distance profile is stored, that is, without any inversion or thresholding applied.
- Attributes:
- X_ : np.ndarray, 3D array of shape (n_cases, n_channels, n_timepoints)
The input time series stored during the fit method. This is the database we search in when given a query.
- distance_profile_function : function
The function used to compute the distance profile. This is determined during the fit method based on the distance and normalise parameters.
Notes
Capabilities
Missing Values: No
Multithreading: Yes
Univariate: Yes
Multivariate: Yes
Unequal Length: Yes
For now, the multivariate case is only treated as independent. Distances are computed for each channel independently and then summed together.
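The channel-independent treatment can be illustrated with a naive distance profile. This is a plain-Python sketch under stated assumptions, not aeon's code: `distance_profile` is a hypothetical helper, the data is made up, and the squared Euclidean distance is used only to keep the arithmetic simple.

```python
def distance_profile(series, query):
    """Naive squared Euclidean distance profile of a multivariate query.

    series: list of channels, each a list of n_timepoints values.
    query: list of channels, each a list of query_length values.
    Hypothetical helper for illustration; not part of aeon.
    """
    query_length = len(query[0])
    n_candidates = len(series[0]) - query_length + 1
    profile = [0.0] * n_candidates
    # Each channel is processed on its own ...
    for channel, query_channel in zip(series, query):
        for start in range(n_candidates):
            window = channel[start:start + query_length]
            # ... and its distance is added to the running sum.
            profile[start] += sum(
                (a - b) ** 2 for a, b in zip(window, query_channel)
            )
    return profile


series = [[0, 1, 2, 3], [1, 1, 0, 0]]  # 2 channels, 4 time points
query = [[1, 2], [1, 0]]               # 2 channels, length 2
profile = distance_profile(series, query)
```

The query matches the series exactly at time point 1, so the summed distance there is zero.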
Methods
clone([random_state]): Obtain a clone of the object with the same hyperparameters.
fit(X[, y]): Fit method: data preprocessing and storage.
get_class_tag(tag_name[, raise_error, ...]): Get tag value from estimator class (only class tags).
get_class_tags(): Get class tags from estimator class and all its parent classes.
get_fitted_params([deep]): Get fitted parameters.
get_metadata_routing(): Sklearn metadata routing.
get_params([deep]): Get parameters for this estimator.
get_speedup_function_names(): Get available speedup for query search in aeon.
get_tag(tag_name[, raise_error, ...]): Get tag value from estimator class.
get_tags(): Get tags from estimator.
predict(X[, axis, X_index, ...]): Predict method: check the shape of X and call _predict to perform the search.
reset([keep]): Reset the object to a clean post-init state.
set_params(**params): Set the parameters of this estimator.
set_tags(**tag_dict): Set dynamic tags to given values.
- final predict(X: ndarray, axis=1, X_index=None, exclusion_factor=2.0, apply_exclusion_to_result=False) -> ndarray
Predict method: check the shape of X and call _predict to perform the search.
If the distance profile function is normalised, the means and standard deviations of X and X_ are stored, where X_ is the training data.
- Parameters:
- X : np.ndarray, 2D array of shape (n_channels, query_length)
Input query used for similarity search.
- axis : int, default=1
The time point axis of the input series if it is 2D. If axis==0, each column is assumed to be a time series and each row a time point, i.e. the shape of the data is (n_timepoints, n_channels). axis==1 indicates that the time series are in rows, i.e. the shape of the data is (n_channels, n_timepoints).
- X_index : Iterable
An Iterable (tuple, list, array) of length two used to specify the index of the query X if it was extracted from the input data X given during the fit method. Given the tuple (id_sample, id_timestamp), the similarity search will define an exclusion zone around the X_index in order to avoid matching X with itself. If None, the query is considered not to be extracted from X_.
- exclusion_factor : float, default=2.
The factor applied to the query length to define the exclusion zone. The exclusion zone is defined from \(id_timestamp - query_length//exclusion_factor\) to \(id_timestamp + query_length//exclusion_factor\). This also applies to the matching conditions defined by child classes. For example, with TopKSimilaritySearch, the k best matches are also subject to the exclusion zone, but with \(id_timestamp\) the index of one of the k matches.
- apply_exclusion_to_result : bool, default=False
Whether to apply the exclusion factor to the output of the similarity search. This means that two matches of the query from the same sample must be spaced by at least +/- \(query_length//exclusion_factor\). This avoids pathological matching: for example, when extracting the two best matches, if the best match is located at \(id_timestamp\), the second best match is likely to be located at \(id_timestamp\) +/- 1, as the two share all their values except one.
- Returns:
- Tuple(ndarray, ndarray)
The first array, of shape (n_matches), contains the distances between the query and its best matches in X_. The second array, of shape (n_matches, 2), contains the indexes of these matches as (id_sample, id_timepoint). The corresponding match can be retrieved as X_[id_sample, :, id_timepoint : id_timepoint + length].
- Raises:
- TypeError
If the input X array is not 2D.
- ValueError
If the length of the query is greater
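The exclusion zone described for exclusion_factor in predict can be sketched as follows. This is a plain-Python illustration, not aeon's implementation: `exclusion_bounds` and `mask_exclusion_zone` are hypothetical helpers, the distance profile is made up, and treating both bounds as inclusive is an assumption.

```python
import math


def exclusion_bounds(id_timestamp, query_length, exclusion_factor=2.0):
    """Bounds of the exclusion zone around id_timestamp (hypothetical helper)."""
    half_width = int(query_length // exclusion_factor)
    return id_timestamp - half_width, id_timestamp + half_width


def mask_exclusion_zone(profile, id_timestamp, query_length,
                        exclusion_factor=2.0):
    """Return a copy of profile with the exclusion zone set to infinity."""
    low, high = exclusion_bounds(id_timestamp, query_length, exclusion_factor)
    masked = list(profile)
    # Clip the zone to the profile; both bounds inclusive (an assumption).
    for i in range(max(low, 0), min(high + 1, len(masked))):
        masked[i] = math.inf
    return masked


# Made-up profile for a query of length 2 extracted at time point 1,
# which therefore matches itself with distance 0.
profile = [0.5, 0.0, 0.1, 0.7, 0.2, 0.9]
masked = mask_exclusion_zone(profile, id_timestamp=1, query_length=2)
best = masked.index(min(masked))
```

Without the mask, the best match would be the query itself at time point 1, followed by its immediate neighbour at time point 2. With the mask, the best match is the first candidate outside the zone with the lowest distance.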
- classmethod get_speedup_function_names() -> dict
Get available speedup for query search in aeon.
The returned structure is a dictionary that contains the names of all available speedups for normalised and non-normalised distance functions.
- Returns:
- dict
The names of the available speedups that can be used as parameters in similarity search classes.
- clone(random_state=None)
Obtain a clone of the object with the same hyperparameters.
A clone is a different object without shared references, in post-init state. This function is equivalent to returning sklearn.clone of self. Equal in value to type(self)(**self.get_params(deep=False)).
- Parameters:
- random_state : int, RandomState instance, or None, default=None
Sets the random state of the clone. If None, the random state is not set. If int, random_state is the seed used by the random number generator. If RandomState instance, random_state is the random number generator.
- Returns:
- estimator : object
Instance of type(self), clone of self (see above)
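The expression type(self)(**self.get_params(deep=False)) can be tried on a toy class. This is a minimal sketch: `ToyEstimator` is a hypothetical stand-in, not an aeon class, and it only mimics the hyperparameter handling that the expression relies on.

```python
class ToyEstimator:
    """Hypothetical stand-in for an estimator; not an aeon class."""

    def __init__(self, k=1, threshold=float("inf")):
        self.k = k
        self.threshold = threshold

    def get_params(self, deep=True):
        # Hyperparameters are exactly the arguments of __init__.
        return {"k": self.k, "threshold": self.threshold}

    def fit(self, X):
        # Fitted state is stored in an attribute with a trailing underscore.
        self.X_ = X
        return self


fitted = ToyEstimator(k=3).fit([[1.0, 2.0, 3.0]])

# Rebuild the estimator from its hyperparameters only.
copy = type(fitted)(**fitted.get_params(deep=False))
```

The copy carries the same hyperparameters but none of the fitted state, which is what "post-init state" refers to.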
- fit(X: ndarray, y=None)
Fit method: data preprocessing and storage.
- Parameters:
- X : np.ndarray, 3D array of shape (n_cases, n_channels, n_timepoints)
Input array to be used as database for the similarity search.
- y : optional
Not used.
- Returns:
- self
- Raises:
- TypeError
If the input X array is not 3D.
- classmethod get_class_tag(tag_name, raise_error=True, tag_value_default=None)
Get tag value from estimator class (only class tags).
- Parameters:
- tag_name : str
Name of tag value.
- raise_error : bool, default=True
Whether a ValueError is raised when the tag is not found.
- tag_value_default : any type, default=None
Default/fallback value if tag is not found and error is not raised.
- Returns:
- tag_value
Value of the tag_name tag in cls. If not found, raises an error if raise_error is True, otherwise returns tag_value_default.
- Raises:
- ValueError
If raise_error is True and tag_name is not in self.get_tags().keys()
Examples
>>> from aeon.classification import DummyClassifier
>>> DummyClassifier.get_class_tag("capability:multivariate")
True
- classmethod get_class_tags()
Get class tags from estimator class and all its parent classes.
- Returns:
- collected_tags : dict
Dictionary of tag name and tag value pairs. Collected from the _tags class attribute via nested inheritance. These are not overridden by dynamic tags set by set_tags or class __init__ calls.
- get_fitted_params(deep=True)
Get fitted parameters.
- State required:
Requires state to be "fitted".
- Parameters:
- deep : bool, default=True
If True, will return the fitted parameters for this estimator and contained subobjects that are estimators.
- Returns:
- fitted_params : dict
Fitted parameter names mapped to their values.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
- get_tag(tag_name, raise_error=True, tag_value_default=None)
Get tag value from estimator class.
Includes dynamic and overridden tags.
- Parameters:
- tag_name : str
Name of tag to be retrieved.
- raise_error : bool, default=True
Whether a ValueError is raised when the tag is not found.
- tag_value_default : any type, default=None
Default/fallback value if tag is not found and error is not raised.
- Returns:
- tag_value
Value of the tag_name tag in self. If not found, raises an error if raise_error is True, otherwise returns tag_value_default.
- Raises:
- ValueError
If raise_error is True and tag_name is not in self.get_tags().keys()
Examples
>>> from aeon.classification import DummyClassifier
>>> d = DummyClassifier()
>>> d.get_tag("capability:multivariate")
True
- get_tags()
Get tags from estimator.
Includes dynamic and overridden tags.
- Returns:
- collected_tags : dict
Dictionary of tag name and tag value pairs. Collected from the _tags class attribute via nested inheritance and then any overridden and new tags from __init__ or set_tags.
- reset(keep=None)
Reset the object to a clean post-init state.
After a self.reset() call, self is equal or similar in value to type(self)(**self.get_params(deep=False)), assuming no other attributes were kept using keep.
- Detailed behaviour:
- removes any object attributes, except:
  - hyper-parameters (arguments of __init__)
  - object attributes containing double-underscores, i.e., the string "__"
- runs __init__ with current values of hyperparameters (result of get_params)
- Not affected by the reset are:
  - object attributes containing double-underscores
  - class and object methods, class attributes
  - any attributes specified in the keep argument
- Parameters:
- keep : None, str, or list of str, default=None
If None, all attributes are removed except hyperparameters. If str, only the attribute with this name is kept. If list of str, only the attributes with these names are kept.
- Returns:
- self : object
Reference to self.
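The reset behaviour described above can be reproduced on a toy class. This is a sketch under stated assumptions, not aeon's base class: `ToyResettable` is hypothetical and implements only the rules listed for reset (hyperparameters survive, attributes containing "__" or named in keep survive, everything else is removed).

```python
class ToyResettable:
    """Hypothetical estimator used only to illustrate reset semantics."""

    def __init__(self, k=1):
        self.k = k

    def get_params(self, deep=True):
        return {"k": self.k}

    def reset(self, keep=None):
        # Remember the current hyperparameter values.
        params = self.get_params(deep=False)
        if keep is None:
            keep = []
        elif isinstance(keep, str):
            keep = [keep]
        # Attributes that survive: those named in keep or containing "__".
        kept = {
            name: value
            for name, value in vars(self).items()
            if name in keep or "__" in name
        }
        # Drop everything, rerun __init__, then restore the kept attributes.
        self.__dict__.clear()
        self.__init__(**params)
        self.__dict__.update(kept)
        return self


est = ToyResettable(k=4)
est.X_ = [[1.0, 2.0]]           # fitted state, removed by reset
est.cache_ = "kept on request"  # survives because it is named in keep
result = est.reset(keep="cache_")
```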
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
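The <component>__<parameter> naming can be illustrated by splitting a parameter dictionary on the first double underscore. This is a minimal sketch, not the library's code: `split_nested_params` is a hypothetical helper, and the component name "search" is made up for the example.

```python
def split_nested_params(params):
    """Separate top-level parameters from <component>__<parameter> ones.

    Hypothetical helper for illustration; not part of aeon or scikit-learn.
    """
    own, nested = {}, {}
    for key, value in params.items():
        # Split on the first "__" only; the rest belongs to the component.
        component, sep, sub_key = key.partition("__")
        if sep:
            nested.setdefault(component, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params(
    {"n_jobs": 2, "search__k": 5, "search__threshold": 0.5}
)
```

Parameters without a double underscore are applied to the estimator itself, while the grouped ones would be forwarded to the named component.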