ggml_ot.test
- ggml_ot.test(dataset, ground_metric=None, *args, knn_k=5, **kwargs)
Tests a ground metric on a given dataset.
This function evaluates the provided ground metric on n_splits stratified train-test splits. For each split, the metric is evaluated on the test set using k-NN classification and hierarchical clustering.
Classification accuracy and clustering metrics are summarized in a table, and the results are visualized in plots.
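The evaluation loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not ggml_ot's implementation: it takes a precomputed pairwise distance matrix `D` (built here from toy 2-D points with Euclidean distance, whereas ggml_ot derives its distances from the ground metric) and uses scikit-learn and SciPy as stand-ins. The helper name `evaluate_splits`, its parameters, and the choice of average linkage and adjusted Rand index are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import cdist, squareform
from sklearn.metrics import adjusted_rand_score
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.neighbors import KNeighborsClassifier


def evaluate_splits(D, labels, n_splits=5, knn_k=5, test_size=0.4, seed=0):
    """Hypothetical helper: score a precomputed distance matrix D on each split."""
    labels = np.asarray(labels)
    n_classes = len(np.unique(labels))
    splitter = StratifiedShuffleSplit(
        n_splits=n_splits, test_size=test_size, random_state=seed
    )
    results = []
    for train_idx, test_idx in splitter.split(np.zeros(len(labels)), labels):
        # k-NN on precomputed distances: fit on train-vs-train distances,
        # then score on test-vs-train distances.
        knn = KNeighborsClassifier(n_neighbors=knn_k, metric="precomputed")
        knn.fit(D[np.ix_(train_idx, train_idx)], labels[train_idx])
        accuracy = knn.score(D[np.ix_(test_idx, train_idx)], labels[test_idx])

        # Average-linkage hierarchical clustering of the test samples only.
        D_test = D[np.ix_(test_idx, test_idx)]
        Z = linkage(squareform(D_test, checks=False), method="average")
        clusters = fcluster(Z, t=n_classes, criterion="maxclust")
        ari = adjusted_rand_score(labels[test_idx], clusters)

        results.append({"knn_accuracy": accuracy, "ari": ari})
    return results


# Toy data (an assumption for this sketch): two well-separated groups of points.
rng = np.random.default_rng(0)
X = np.vstack(
    [rng.normal(0.0, 0.1, size=(20, 2)), rng.normal(5.0, 0.1, size=(20, 2))]
)
y = np.array([0] * 20 + [1] * 20)
D = cdist(X, X, metric="euclidean")
scores = evaluate_splits(D, y)
```

Each entry of `scores` holds the k-NN accuracy and a clustering score for one split; a summary table would then aggregate these across splits.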
- Parameters:
- dataset
TripletDataset | AnnData_TripletDataset Dataset to perform cross-validation on.
See also
The documentation for the provided interfaces to AnnData and numpy arrays.
- ground_metric
UnionType[ndarray, str, None] (default: None) Ground metric to use for testing. If None (default), the function tries to use dataset.map_A. You can also explicitly provide a ground metric trained with ggml_ot.train() as a numpy array.
To use a fixed metric, provide the metric name as a string (e.g. "euclidean", "cosine"); see [scipy.spatial.distance.cdist](https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html) for supported metrics.
Warning
If no ground_metric is provided and the dataset has not been trained, this function issues a warning and trains a ground metric for each split. If you want to both train and test ground metrics, use ggml_ot.train_test() directly.
- knn_k
int (default: 5) Number of neighbors used for the benchmark k-NN classification.
- args
Additional positional arguments passed to ggml_ot.train_test(). Internally, this function calls ggml_ot.train_test() with the provided ground metric and skips training.
- kwargs
Additional keyword arguments passed to ggml_ot.train_test(). Internally, this function calls ggml_ot.train_test() with the provided ground metric and skips training.
- Return type:
DataFrame
- Returns:
pd.DataFrame DataFrame summarizing the mean and standard deviation of the evaluation metrics across test splits.
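As an illustration of the kind of summary described under Returns, per-split scores can be aggregated with pandas. The column names and values below are made up for this example, and the exact layout of the DataFrame returned by ggml_ot.test() may differ.

```python
import pandas as pd

# Made-up per-split scores, one row per test split (illustrative values only).
per_split = pd.DataFrame(
    {
        "knn_accuracy": [0.90, 0.80, 1.00, 0.90],
        "ari": [0.70, 0.60, 0.80, 0.70],
    }
)

# Mean and sample standard deviation of every metric across splits;
# the transpose puts one metric per row and one statistic per column.
summary = per_split.agg(["mean", "std"]).T
```

Here `summary` has one row per metric and the columns `mean` and `std`.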
Note
While this function can train a ground metric, it does so only for evaluation purposes and does not return the trained metric. To train ground metrics for later use, use ggml_ot.train() or ggml_ot.train_test().
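To illustrate the two forms of ground_metric described under Parameters (a metric name or a numpy array), the sketch below computes pairwise distances both ways with SciPy. It rests on an assumption that is not confirmed by this page: that a learned ground metric stored as a matrix `A` acts as a linear map, so that distances are Euclidean after projecting the points with `A`. The matrix and points are hypothetical toy values.

```python
import numpy as np
from scipy.spatial.distance import cdist

X = np.array([[0.0, 0.0], [3.0, 4.0]])
Y = np.array([[0.0, 0.0], [1.0, 0.0]])

# Fixed metric selected by name, as accepted by scipy's cdist.
D_euclidean = cdist(X, Y, metric="euclidean")

# ASSUMPTION: a learned ground metric given as a matrix A is applied as a
# linear projection, and distances are Euclidean in the projected space.
A = np.array([[1.0, 0.0]])  # hypothetical rank-1 metric keeping only feature 0
D_learned = cdist(X @ A.T, Y @ A.T, metric="euclidean")
```

With this toy `A`, the second feature is ignored, so `D_learned` reflects differences in the first coordinate only.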