ggml_ot.test

ggml_ot.test(dataset, ground_metric=None, *args, knn_k=5, **kwargs)

Tests a ground metric on a given dataset.

This function evaluates a provided ground metric on n_splits stratified train-test splits. For each split, the ground metric is evaluated on the test set using k-NN classification and hierarchical clustering.

Classification accuracy and clustering metrics are summarized in a table, and visualizations of the results are plotted.
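
As a rough illustration of the k-NN part of this evaluation, the sketch below classifies test items by majority vote over a precomputed distance matrix. This page does not show how ggml_ot implements the step internally, so the helper knn_predict, the toy distances, and the tie-breaking rule are all assumptions made for the example.

```python
import numpy as np


def knn_predict(dist_test_train, train_labels, k=5):
    """Majority-vote k-NN on a precomputed (n_test, n_train) distance matrix.

    Hypothetical helper for illustration; not part of ggml_ot.
    """
    train_labels = np.asarray(train_labels)
    # Indices of the k closest training items for every test item.
    nearest = np.argsort(dist_test_train, axis=1, kind="stable")[:, :k]
    predictions = []
    for row in nearest:
        labels, counts = np.unique(train_labels[row], return_counts=True)
        # On a tie, np.argmax picks the first maximum, i.e. the smallest label.
        predictions.append(labels[np.argmax(counts)])
    return np.array(predictions)


# Toy distances from 2 test items (rows) to 4 training items (columns).
dist = np.array([[0.1, 0.2, 0.9, 0.8],
                 [0.9, 0.7, 0.1, 0.3]])
train_labels = np.array([0, 0, 1, 1])
test_labels = np.array([0, 1])

pred = knn_predict(dist, train_labels, k=3)
accuracy = float(np.mean(pred == test_labels))
```

Working from a precomputed matrix keeps the classifier independent of how the distances were obtained, which is why the same evaluation can be applied to a learned ground metric or a fixed one.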

Parameters:
dataset TripletDataset | AnnData_TripletDataset

Dataset to perform cross-validation on.

See also

The documentation for the provided interfaces to AnnData and numpy arrays.

ground_metric UnionType[ndarray, str, None] (default: None)

Ground metric to use for testing. If None (default), tries to use dataset.map_A. You can also explicitly provide a ground metric trained with ggml_ot.train() as a numpy array.

To use a fixed metric, provide the metric name as a string (e.g. "euclidean", "cosine"); see [scipy.spatial.distance.cdist](https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html) for supported metrics.

Warning

If no ground_metric is provided and dataset has not been trained, this function will issue a warning and train a ground metric for each split. If you want to train and test ground metrics, you are encouraged to directly use ggml_ot.train_test().

knn_k int (default: 5)

Number of neighbors used for benchmark k-NN classification.

args

Additional positional arguments passed to ggml_ot.train_test(). Internally, this function calls ggml_ot.train_test() with the provided ground metric and skips training.

kwargs

Additional keyword arguments passed to ggml_ot.train_test().
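
The fixed metric names accepted by ground_metric are those understood by scipy's cdist. The following sketch shows what two of them compute on a few toy points; the arrays are made up for illustration and say nothing about how ggml_ot applies the metric to a dataset.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Toy points, invented for this example.
points_a = np.array([[0.0, 0.0], [1.0, 0.0]])
points_b = np.array([[0.0, 1.0], [3.0, 4.0]])

# Pairwise Euclidean distances, shape (2, 2).
euclidean = cdist(points_a, points_b, metric="euclidean")

# Cosine distance is undefined for the zero vector, so only the
# non-zero point from points_a is used here; shape (1, 2).
cosine = cdist(points_a[1:], points_b, metric="cosine")
```

Any other metric name listed in the cdist documentation can be substituted for the string in the same way.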

Return type:

DataFrame

Returns:

pd.DataFrame DataFrame summarizing the mean and standard deviation of the evaluation metrics across test splits.
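
A summary of this kind can be reproduced with pandas, which may help when reading the returned table. The column names and per-split scores below are hypothetical; the actual columns produced by ggml_ot.test are not listed on this page.

```python
import pandas as pd

# Hypothetical per-split scores: one row per test split.
per_split = pd.DataFrame(
    {
        "knn_acc": [0.80, 0.90, 0.85],  # assumed name for k-NN accuracy
        "ari": [0.50, 0.70, 0.60],      # assumed name for a clustering score
    }
)

# One row per evaluation metric, with its mean and sample standard
# deviation (pandas uses ddof=1 by default) across splits.
summary = per_split.agg(["mean", "std"]).T
```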

Note

While this function can be used to train a ground metric, it only does so for evaluation purposes and does not return the trained metric. For training ground metrics for later use, please use ggml_ot.train() or ggml_ot.train_test().
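
One way to use the function is to compare several fixed metrics as baselines. The sketch below relies only on the signature documented above (dataset, ground_metric, knn_k); the helper compare_fixed_metrics is hypothetical, and the test function is passed in as an argument so that the snippet does not depend on how a dataset is constructed.

```python
def compare_fixed_metrics(test_fn, dataset, metric_names=("euclidean", "cosine"), knn_k=5):
    """Run test_fn once per fixed metric name and collect the results.

    Hypothetical helper: test_fn is expected to behave like ggml_ot.test,
    i.e. accept (dataset, ground_metric=..., knn_k=...) and return a
    summary DataFrame.
    """
    results = {}
    for name in metric_names:
        results[name] = test_fn(dataset, ground_metric=name, knn_k=knn_k)
    return results


# Intended usage, assuming `dataset` is an existing TripletDataset:
#   import ggml_ot
#   summaries = compare_fixed_metrics(ggml_ot.test, dataset, knn_k=5)
```

Passing a metric name explicitly also avoids the per-split training that is triggered when no ground metric is available.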