Train hierarchical classifier
Created on Wed Oct 23 11:37:16 2019
@author: Lieke
- scHPL.train.train_tree(data, labels, tree: TreeNode, classifier: Literal['knn', 'svm', 'svm_occ'] = 'knn', dimred: bool = False, useRE: bool = True, FN: float = 0.5, n_neighbors: int = 50, dynamic_neighbors: bool = True, distkNN: int = 99, gpu: int | None = None)
Train a hierarchical classifier.
- Parameters:
data (array_like) – Training data (cells x genes)
labels (array_like) – Cell type labels of the training data
tree (TreeNode) – Classification tree to train (can be built using utils.create_tree())
classifier (String = 'knn') – Classifier to use (either ‘svm’, ‘svm_occ’ or ‘knn’).
dimred (Boolean = False) – If ‘True’, PCA is applied before training the classifier.
useRE (Boolean = True) – If ‘True’, cells are also rejected based on the reconstruction error.
FN (Float = 0.5) – Percentage of false negatives allowed when determining the threshold for the reconstruction error.
n_neighbors (int = 50) – Number of neighbors for the kNN classifier (only used when classifier=’knn’).
dynamic_neighbors (bool = True) – If ‘True’, the number of neighbors for the kNN classifier is reduced when a node contains a very small cell population: k is set to min(n_neighbors, size of the smallest cell population).
distkNN (int = 99) – Used to determine the threshold for the maximum distance between a cell and its closest neighbor in the training set. The threshold is set to the distkNN-th percentile of the distances within the training set.
gpu (int | None = None) – GPU index to use for the Faiss library (only used when classifier=’knn’)
- Return type:
Trained classification tree
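
The two kNN-related rules described for dynamic_neighbors and distkNN can be illustrated with a minimal NumPy sketch. This is not the scHPL implementation: the helper names effective_k and distance_threshold are hypothetical, and it assumes that "distances within the training set" means each training cell's distance to its nearest other training cell.

```python
import numpy as np


def effective_k(n_neighbors, population_sizes, dynamic_neighbors=True):
    """Hypothetical helper: pick k as described for `dynamic_neighbors`.

    When dynamic_neighbors is on, k = min(n_neighbors, smallest population).
    """
    if not dynamic_neighbors:
        return n_neighbors
    return min(n_neighbors, min(population_sizes))


def distance_threshold(train, distkNN=99):
    """Hypothetical helper: distkNN-th percentile of nearest-neighbor distances.

    Assumption: the distances are Euclidean distances from each training
    cell to its closest other training cell.
    """
    # Pairwise Euclidean distances (fine for small examples, O(n^2) memory).
    diff = train[:, None, :] - train[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Exclude each cell's zero distance to itself.
    np.fill_diagonal(dist, np.inf)
    nearest = dist.min(axis=1)
    return np.percentile(nearest, distkNN)


# Toy data: 200 cells x 5 features, three populations of unequal size.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 5))

k = effective_k(n_neighbors=50, population_sizes=[120, 60, 20])
thr = distance_threshold(train, distkNN=99)
```

With the default n_neighbors=50 and a smallest population of 20 cells, this sketch picks k = 20. A lower distkNN gives a stricter distance threshold, so more cells would fall beyond it.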