get_estimator_results

get_estimator_results(estimators: str | list[str], datasets: list[str] | None = None, num_resamples: int | None = 1, task: str = 'classification', measure: str = 'accuracy', remove_dataset_modifiers: bool = False, path: str = 'http://timeseriesclassification.com/results/ReferenceResults')

Look for results for given estimators for a list of datasets.

This function loads or downloads a CSV of results, scans it for the requested datasets, and returns any results found as a dictionary. Datasets that are not present in the results are ignored.

Parameters:

estimators: str or list of str: Estimator name or list of estimator names to search for. See get_available_estimators, aeon.benchmarking.results_loading.NAME_ALIASES or the directory at path for valid options.
datasets: list of str or None, default=None: List of problem names to search for. Datasets that are not present in the results are ignored. If None, results for all datasets the estimator has results for are returned.
num_resamples: int or None, default=1: The number of data resamples to return scores for. The first resample is the default train/test split for the dataset. For 1, only the score for the default train/test split is returned. For 2 or more, an np.ndarray of scores for all resamples up to num_resamples is returned. If None, the scores of all resamples are returned.
task: str, default="classification": Should be one of aeon.benchmarking.results_loading.VALID_TASK_TYPES, i.e. "classification", "clustering", "regression".
measure: str, default="accuracy": Should be one of aeon.benchmarking.results_loading.VALID_RESULT_MEASURES[task]. The valid measures depend on the task, e.g. "accuracy", "auroc", "balacc" for classification and "mse", "mae", "r2" for regression.
remove_dataset_modifiers: bool, default=False: If True, removes any dataset modifier (anything after the first underscore) from the dataset names in the loaded results file, e.g. a loaded result row for "Dataset_eq" is converted to just "Dataset".
path: str, default="http://timeseriesclassification.com/results/ReferenceResults": Path to read results from. Defaults to timeseriesclassification.com.
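
The effect of remove_dataset_modifiers can be pictured with a minimal sketch. The helper strip_dataset_modifier below is hypothetical and is not part of aeon; it only assumes, as described above, that the modifier is everything from the first underscore onward.

```python
def strip_dataset_modifier(name: str) -> str:
    """Return the dataset name without its modifier suffix.

    Hypothetical helper for illustration only, not aeon's implementation.
    """
    # Split once on the first underscore and keep the leading part.
    return name.split("_", 1)[0]


print(strip_dataset_modifier("Dataset_eq"))  # Dataset
print(strip_dataset_modifier("Chinatown"))   # Chinatown
```

Names without an underscore pass through unchanged, so applying the rule to unmodified dataset names has no effect.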

Returns:

results: dict: Dictionary keyed by estimator name. Each value is a sub-dictionary keyed by dataset name that contains the scores for that dataset.
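
Since a score can be a single value (num_resamples=1) or an array of per-resample scores, downstream code may need to handle both shapes. The following is a minimal sketch of one way to do that, assuming only the nested dictionary structure described above. mean_scores is a hypothetical helper, not an aeon function.

```python
from statistics import fmean


def mean_scores(results: dict) -> dict:
    """Collapse each dataset entry to a single mean score.

    Hypothetical helper: accepts the nested dict described above, where a
    value is either one score or an iterable of per-resample scores.
    """
    summary = {}
    for estimator, per_dataset in results.items():
        summary[estimator] = {}
        for dataset, score in per_dataset.items():
            try:
                # Array-like value: one score per resample.
                values = [float(s) for s in score]
            except TypeError:
                # Scalar value: the default train/test split only.
                values = [float(score)]
            summary[estimator][dataset] = fmean(values)
    return summary
```

Averaging is only one possible summary; the same loop could just as well keep the per-resample scores and report a spread.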

Examples

>>> from aeon.benchmarking.results_loaders import get_estimator_results
>>> cls = ["HC2"]  
>>> data = ["Chinatown", "Adiac"]  
>>> get_estimator_results(estimators=cls, datasets=data) 
{'HC2': {'Chinatown': 0.9825072886297376, 'Adiac': 0.8107416879795396}}
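
The nested dictionary can be flattened into rows for tabular processing. The sketch below reuses the output shown above as a literal, so it runs without network access. results_to_rows is a hypothetical helper, not part of aeon.

```python
def results_to_rows(results: dict) -> list:
    """Flatten {estimator: {dataset: score}} into (estimator, dataset, score) rows.

    Hypothetical helper for illustration only.
    """
    return [
        (estimator, dataset, score)
        for estimator, per_dataset in results.items()
        for dataset, score in per_dataset.items()
    ]


# Literal copy of the example output above, used here in place of a live call.
results = {"HC2": {"Chinatown": 0.9825072886297376, "Adiac": 0.8107416879795396}}

for estimator, dataset, score in results_to_rows(results):
    print(f"{estimator:5s} {dataset:10s} {score:.4f}")
```

Rows in this shape can be written out with the csv module or passed to a dataframe constructor.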