Benchmarking

The aeon.benchmarking module contains tools for comparing and evaluating time series models, loading stored results, and calculating performance metrics for a variety of tasks.

Results loading

Results loaders and loading utilities for aeon (and other) estimators.

estimator_alias(name)

Return the standard name for a possibly aliased estimator.

get_available_estimators([task, as_list])

Get a DataFrame of estimators available for a specific learning task.

get_estimator_results(estimators[, ...])

Retrieve results for the given estimators on a list of datasets.

get_estimator_results_as_array(estimators[, ...])

Retrieve results for the given estimators on a list of datasets, as an array.
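To illustrate the aliasing idea behind estimator_alias, here is a minimal sketch. The alias table and the toy_estimator_alias name are invented for this example; they are not aeon's real alias data or API, which you should look up via estimator_alias itself.

```python
# Toy alias table -- made up for illustration, NOT aeon's real table.
ALIASES = {
    "hc2": "HIVECOTEV2",
    "hivecote2": "HIVECOTEV2",
    "rocket": "ROCKET",
}


def toy_estimator_alias(name: str) -> str:
    """Return a canonical name for a possibly aliased estimator name.

    Normalises the input (lower case, no separators) before the lookup,
    and returns the input unchanged when no alias is known.
    """
    key = name.lower().replace("-", "").replace("_", "")
    return ALIASES.get(key, name)
```

With this toy table, `toy_estimator_alias("HC2")` and `toy_estimator_alias("HIVE-COTE2")` both resolve to the same canonical name, which is what makes downstream results lookups robust to naming variants.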

Published results

Results loaders for specific publications.

load_classification_bake_off_2017_results([...])

Fetch all the results of the 2017 univariate TSC bake off.

load_classification_bake_off_2021_results([...])

Pull down all the results of the 2021 multivariate bake off.

load_classification_bake_off_2023_results([...])

Pull down all the results of the 2023 univariate bake off.

Resampling

Functions for resampling time series data.

resample_data(X_train, y_train, X_test, y_test)

Resample data without replacement using a random state.

resample_data_indices(y_train, y_test[, ...])

Return data resample indices without replacement using a random state.

stratified_resample_data(X_train, y_train, ...)

Stratified resampling of data without replacement using a random state.

stratified_resample_data_indices(y_train, y_test)

Return stratified data resample indices without replacement using a random state.
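The behaviour described for resample_data can be sketched as: pool the train and test sets, shuffle once without replacement, and split back at the original sizes. This is a minimal illustration of that idea, not aeon's implementation (which also handles aeon's collection data formats); the toy_resample name is made up.

```python
import numpy as np


def toy_resample(X_train, y_train, X_test, y_test, random_state=None):
    """Pool train and test, shuffle once, split back at the original sizes.

    A sketch of the resampling idea: every case ends up in exactly one
    split (sampling without replacement), and the split sizes are kept.
    """
    rng = np.random.RandomState(random_state)
    X = np.concatenate([X_train, X_test])
    y = np.concatenate([y_train, y_test])
    idx = rng.permutation(len(y))  # one shuffle = sampling without replacement
    n = len(y_train)
    return X[idx[:n]], y[idx[:n]], X[idx[n:]], y[idx[n:]]
```

The stratified variants follow the same pattern but shuffle within each class, so the class proportions of the original split are preserved.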

Performance metrics

Performance metrics used for evaluating aeon estimators.

Anomaly Detection

range_precision(y_true, y_pred[, alpha, ...])

Compute the range-based precision metric.

range_recall(y_true, y_pred[, alpha, ...])

Compute the range-based recall metric.

range_f_score(y_true, y_pred[, beta, ...])

Compute the F-score using the range-based recall and precision metrics.

roc_auc_score(y_true, y_score)

Compute the ROC AUC score.

pr_auc_score(y_true, y_score)

Compute the precision-recall AUC score.

rp_rr_auc_score(y_true, y_score[, ...])

Compute the AUC-score of the range-based precision-recall curve.

f_score_at_k_points(y_true, y_score[, k])

Compute the F-score at k based on single points.

f_score_at_k_ranges(y_true, y_score[, k])

Compute the range-based F-score at k based on anomaly ranges.

range_pr_roc_auc_support(y_true, y_score[, ...])

Compute the range-based PR and ROC AUC.

range_roc_auc_score(y_true, y_score[, ...])

Compute the range-based area under the ROC curve.

range_pr_auc_score(y_true, y_score[, ...])

Compute the area under the range-based PR curve.

range_pr_vus_score(y_true, y_score[, ...])

Compute the range-based PR VUS score.

range_roc_vus_score(y_true, y_score[, ...])

Compute the range-based ROC VUS score.
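For intuition about the point-wise metrics, roc_auc_score measures the probability that a randomly chosen anomalous point receives a higher score than a randomly chosen normal one. A minimal rank-based (Mann-Whitney) sketch of that quantity, not aeon's implementation; the toy_roc_auc name is made up:

```python
def toy_roc_auc(y_true, y_score):
    """Point-wise ROC AUC via the Mann-Whitney formulation.

    Counts, over all (anomalous, normal) pairs, how often the anomalous
    point scores higher (ties count half), and normalises by the number
    of pairs.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The range-based variants above generalise this from single points to anomaly ranges, rewarding detections that overlap a true anomalous region rather than requiring an exact point match.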

Anomaly detection thresholding

percentile_threshold(y_score, percentile)

Calculate a threshold based on a percentile of the anomaly scores.

sigma_threshold(y_score[, factor])

Calculate a threshold based on the standard deviation of the anomaly scores.

top_k_points_threshold(y_true, y_score[, k])

Calculate a threshold such that at least k anomalous points are found.

top_k_ranges_threshold(y_true, y_score[, k])

Calculate a threshold such that at least k anomalies are found.

Clustering

clustering_accuracy_score(y_true, y_pred)

Calculate clustering accuracy.
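Clustering accuracy is accuracy under the best relabelling of the predicted clusters, since cluster indices are arbitrary. A brute-force sketch of that quantity (aeon computes the optimal assignment more efficiently); it assumes the same number of clusters as classes, and the toy_clustering_accuracy name is made up:

```python
from itertools import permutations


def toy_clustering_accuracy(y_true, y_pred):
    """Best accuracy over all relabellings of the predicted clusters.

    Tries every mapping from predicted cluster labels to true class
    labels and keeps the highest resulting accuracy.
    """
    pred_labels = sorted(set(y_pred))
    best = 0.0
    for perm in permutations(sorted(set(y_true))):
        mapping = dict(zip(pred_labels, perm))
        acc = sum(mapping[p] == t for p, t in zip(y_pred, y_true)) / len(y_true)
        best = max(best, acc)
    return best
```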

Segmentation

count_error(true_change_points, ...)

Count the difference between the number of true and predicted change points.

hausdorff_error(true_change_points, ...[, ...])

Compute the Hausdorff distance between two sets of change points.

prediction_ratio(true_change_points, ...)

Prediction ratio is the ratio of the number of predicted change points to the number of true change points.
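All three segmentation metrics compare a set of predicted change-point locations against the true set. A minimal sketch of each, assuming plain lists of integer positions; the toy_* names are made up, and aeon's hausdorff_error has extra options (e.g. symmetry) not shown here:

```python
def toy_count_error(true_cps, pred_cps):
    """Absolute difference in the number of change points."""
    return abs(len(true_cps) - len(pred_cps))


def toy_prediction_ratio(true_cps, pred_cps):
    """Number of predicted change points over the number of true ones."""
    return len(pred_cps) / len(true_cps)


def toy_hausdorff_error(true_cps, pred_cps):
    """Hausdorff distance between the two change-point sets: the largest
    distance from any point in one set to its nearest point in the other."""
    def directed(a, b):
        return max(min(abs(x - y) for y in b) for x in a)

    return max(directed(true_cps, pred_cps), directed(pred_cps, true_cps))
```

Note the complementary failure modes: count_error and prediction_ratio ignore *where* the change points fall, while the Hausdorff distance is dominated by the single worst-placed point.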

Stats

check_friedman(ranks)

Check whether the Friedman test is significant.

nemenyi_test(ordered_avg_ranks, n_datasets, ...)

Find cliques using the post hoc Nemenyi test.

wilcoxon_test(results, labels[, lower_better])

Perform the Wilcoxon test.
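The Friedman test underlying check_friedman compares k estimators over N datasets via their average ranks. A sketch of the chi-square statistic it is based on; the toy_friedman_statistic name is made up, and aeon additionally compares against a critical value or p-value to decide significance:

```python
def toy_friedman_statistic(avg_ranks, n_datasets):
    """Friedman chi-square statistic from average ranks over N datasets.

    chi2_F = 12N / (k(k+1)) * (sum_j R_j^2 - k(k+1)^2 / 4),
    where R_j is estimator j's rank averaged over the N datasets.
    It is 0 when all estimators tie and grows as the rankings become
    more consistent across datasets.
    """
    k = len(avg_ranks)
    return (12 * n_datasets / (k * (k + 1))) * (
        sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4
    )
```

When the test is significant, the post hoc Nemenyi or pairwise Wilcoxon tests above are used to find which estimators actually differ.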