catlearn.active_learning package¶

Submodules¶

catlearn.active_learning.acquisition_functions module¶

Gaussian process (GP) acquisition functions.

catlearn.active_learning.acquisition_functions.EI(y_best, predictions, uncertainty, objective='max')¶

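Return the expected improvement acquisition function.

Parameters:

y_best (float) – Reference value against which predictions are compared, typically the best target observed so far.
predictions (list) – Predicted means.
uncertainty (list) – Uncertainties associated with the predictions.
objective (str) – Optimization direction, 'max' or 'min'.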
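As a rough illustration of the quantity described above, the following sketch computes the textbook closed form of expected improvement for Gaussian predictions and a maximization objective, using only the Python standard library. The helper name expected_improvement_sketch is hypothetical, and the formula is the standard one from the literature rather than a copy of the CatLearn source.

```python
from statistics import NormalDist


def expected_improvement_sketch(y_best, predictions, uncertainty):
    """Textbook expected improvement for maximization (hypothetical helper)."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    scores = []
    for mean, std in zip(predictions, uncertainty):
        if std <= 0:
            # Without uncertainty the improvement is deterministic.
            scores.append(max(mean - y_best, 0.0))
            continue
        z = (mean - y_best) / std
        scores.append((mean - y_best) * standard_normal.cdf(z)
                      + std * standard_normal.pdf(z))
    return scores


# Three candidates with equal uncertainty and increasing predicted mean.
ei = expected_improvement_sketch(1.0, [0.5, 1.0, 1.5], [0.2, 0.2, 0.2])
```

With equal uncertainties, the score grows with the predicted mean, and a candidate predicted exactly at y_best still receives a positive score because of its uncertainty.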

catlearn.active_learning.acquisition_functions.PI(y_best, predictions, uncertainty, objective)¶

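Return the probability of improvement acquisition function.

Parameters:

y_best (float) – Reference value against which predictions are compared, typically the best target observed so far.
predictions (list) – Predicted means.
uncertainty (list) – Uncertainties associated with the predictions.
objective (str) – Optimization direction, 'max' or 'min'.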
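A minimal sketch of the idea, assuming Gaussian predictions with strictly positive uncertainties: the probability of improvement is the probability mass of each predictive distribution lying beyond y_best. The helper name probability_of_improvement_sketch and the handling of the 'min' objective are assumptions made for illustration, not the library's implementation.

```python
from statistics import NormalDist


def probability_of_improvement_sketch(y_best, predictions, uncertainty,
                                      objective="max"):
    """Probability that a Gaussian prediction beats y_best (hypothetical helper).

    Assumes every uncertainty is strictly positive.
    """
    scores = []
    for mean, std in zip(predictions, uncertainty):
        prob_above = 1.0 - NormalDist(mu=mean, sigma=std).cdf(y_best)
        # For minimization, improvement means falling below y_best.
        scores.append(prob_above if objective == "max" else 1.0 - prob_above)
    return scores


pi = probability_of_improvement_sketch(1.0, [0.5, 1.0, 1.5], [0.5, 0.5, 0.5])
```

A candidate predicted exactly at y_best has an even chance of improving on it, while candidates predicted above or below y_best move toward 1 or 0 respectively.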

catlearn.active_learning.acquisition_functions.UCB(predictions, uncertainty, objective='max', kappa=1.5)¶

Return the upper confidence bound (UCB) acquisition function.

Parameters:

predictions (list) – Predicted means.
uncertainty (list) – Uncertainties associated with the predictions.
objective (str) – Optimization direction, 'max' or 'min'.
kappa (float) – Constant that controls the exploitation/exploration ratio in UCB.
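To illustrate how kappa shifts the balance between exploitation and exploration, the sketch below scores candidates as the predicted mean plus kappa times the uncertainty, which is the usual form of the upper confidence bound for maximization. The helper name upper_confidence_bound_sketch and the example numbers are made up for illustration.

```python
def upper_confidence_bound_sketch(predictions, uncertainty, kappa=1.5):
    """Mean plus kappa times uncertainty, for maximization (hypothetical helper)."""
    return [mean + kappa * std for mean, std in zip(predictions, uncertainty)]


# Candidate 0 has the higher mean, candidate 1 the higher uncertainty.
means = [1.0, 0.8]
stds = [0.1, 0.5]
exploit = upper_confidence_bound_sketch(means, stds, kappa=0.1)
explore = upper_confidence_bound_sketch(means, stds, kappa=1.5)
```

With the small kappa the candidate with the higher mean scores best; with the larger kappa the more uncertain candidate overtakes it.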

catlearn.active_learning.acquisition_functions.classify(classifier, train_atoms, test_atoms, targets, predictions, uncertainty, train_features=None, test_features=None, objective='max', k_means=3, kappa=1.5, metrics=['optimistic', 'UCB', 'EI', 'PI'])¶

Classify ranked predictions based on acquisition function.

Parameters:

classifier (func) – User-defined function to classify an atoms object.
train_atoms (list) – List of atoms objects from training data upon which to base classification.
test_atoms (list) – List of atoms objects from test data upon which to base classification.
targets (list) – List of known target values.
predictions (list) – List of predictions from the GP.
uncertainty (list) – List of variances of the GP predictions.
train_features (array) – Feature matrix for the training data.
test_features (array) – Feature matrix for the test data.
objective (str) – Optimization direction, 'max' or 'min'.
k_means (int) – Number of clusters to generate with clustering.
kappa (float) – Constant that controls the exploitation/exploration ratio in UCB.
metrics (list) – List of strings. Accepted values are ‘cdf’, ‘UCB’, ‘EI’, ‘PI’, ‘optimistic’ and ‘pdf’.

Returns:	res – A dictionary of lists containing the fitness of each test point for the different acquisition functions.
Return type:	dict

catlearn.active_learning.acquisition_functions.cluster(train_features, targets, test_features, predictions, k_means=3)¶

Penalize test points that are too clustered.

Parameters:

train_features (array) – Feature matrix for the training data.
targets (list) – Training targets.
test_features (array) – Feature matrix for the test data.
predictions (list) – Predicted means.
k_means (int) – Number of clusters.

catlearn.active_learning.acquisition_functions.optimistic(y_best, predictions, uncertainty)¶

Find predictions that will optimistically lead to progress.

Parameters:

y_best (float) – Reference value against which predictions are compared, typically the best target observed so far.
predictions (list) – Predicted means.
uncertainty (list) – Uncertainties associated with the predictions.

catlearn.active_learning.acquisition_functions.optimistic_proximity(y_best, predictions, uncertainty)¶

Return uncertainties minus distances to y_best.

Parameters:

y_best (float) – Reference value against which predictions are compared, typically the best target observed so far.
predictions (list) – Predicted means.
uncertainty (list) – Uncertainties associated with the predictions.
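Read literally, "uncertainties minus distances to y_best" can be sketched as follows. The helper name optimistic_proximity_sketch is hypothetical, and taking the distance to be the absolute difference between a prediction and y_best is an assumption.

```python
def optimistic_proximity_sketch(y_best, predictions, uncertainty):
    """Uncertainty minus absolute distance to y_best (hypothetical helper)."""
    return [std - abs(y_best - mean)
            for mean, std in zip(predictions, uncertainty)]


# The middle candidate is both close to y_best and fairly uncertain.
scores = optimistic_proximity_sketch(1.0, [0.5, 0.9, 2.0], [0.1, 0.3, 0.2])
```

Under this reading, a candidate scores highly when its prediction is near y_best and its uncertainty is large.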

catlearn.active_learning.acquisition_functions.probability_density(y_best, predictions, uncertainty)¶

Return probability densities at y_best.

Parameters:

y_best (float) – Reference value at which the probability densities are evaluated, typically the best target observed so far.
predictions (list) – Predicted means.
uncertainty (list) – Uncertainties associated with the predictions.
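One way to picture this metric, assuming each prediction is a Gaussian with the predicted mean and a strictly positive uncertainty as its standard deviation, is to evaluate that Gaussian's density at y_best. The helper name probability_density_sketch is hypothetical.

```python
from statistics import NormalDist


def probability_density_sketch(y_best, predictions, uncertainty):
    """Density of each Gaussian prediction at y_best (hypothetical helper)."""
    return [NormalDist(mu=mean, sigma=std).pdf(y_best)
            for mean, std in zip(predictions, uncertainty)]


# The first candidate is centered on y_best, the second is two sigma away.
densities = probability_density_sketch(1.0, [1.0, 2.0], [0.5, 0.5])
```

The density is largest for the candidate whose predicted mean coincides with y_best and falls off as the mean moves away.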

catlearn.active_learning.acquisition_functions.proximity(y_best, predictions, uncertainty=None)¶

Return negative distances to y_best.

Parameters:

y_best (float) – Reference value against which predictions are compared, typically the best target observed so far.
predictions (list) – Predicted means.
uncertainty (list, optional) – Uncertainties associated with the predictions.

catlearn.active_learning.acquisition_functions.random_acquisition(y_best, predictions, uncertainty=None)¶

Return random numbers for control experiments.

Parameters:

y_best (float) – Reference value against which predictions are compared, typically the best target observed so far.
predictions (list) – Predicted means.
uncertainty (list, optional) – Uncertainties associated with the predictions.

catlearn.active_learning.acquisition_functions.rank(targets, predictions, uncertainty, train_features=None, test_features=None, objective='max', k_means=3, kappa=1.5, metrics=['optimistic', 'UCB', 'EI', 'PI'])¶

Rank predictions based on acquisition function.

Parameters:

targets (list) – List of known target values.
predictions (list) – List of predictions from the GP.
uncertainty (list) – List of variances of the GP predictions.
train_features (array) – Feature matrix for the training data.
test_features (array) – Feature matrix for the test data.
objective (str) – Optimization direction, 'max' or 'min'.
k_means (int) – Number of clusters to generate with clustering.
kappa (float) – Constant that controls the exploitation/exploration ratio in UCB.
metrics (list) – List of strings. Accepted values are ‘cdf’, ‘UCB’, ‘EI’, ‘PI’, ‘optimistic’ and ‘pdf’.

Returns:	res – A dictionary of lists containing the fitness of each test point for the different acquisition functions.
Return type:	dict
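Because the result is a dictionary of fitness lists keyed by metric, a typical next step is to pick the best test points per metric. The sketch below does this on made-up numbers; the helper name top_candidates is hypothetical, and it assumes that a higher fitness value means a more attractive candidate.

```python
def top_candidates(res, n=1):
    """Return the indices of the n highest-scoring test points per metric."""
    best = {}
    for metric, fitness in res.items():
        # Sort test point indices by descending fitness.
        order = sorted(range(len(fitness)), key=lambda i: fitness[i],
                       reverse=True)
        best[metric] = order[:n]
    return best


# Made-up fitness values standing in for the dictionary returned by rank.
res = {"UCB": [0.2, 1.4, 0.9], "EI": [0.05, 0.01, 0.30]}
picked = top_candidates(res, n=2)
```

Different metrics can favor different test points, which is why the selection is kept separate for each key.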

catlearn.active_learning.algorithm module¶

Class to automate building a surrogate model.

class catlearn.active_learning.algorithm.ActiveLearning(surrogate_model, train_data, target)¶

Bases: object

Active learning class, intended for screening or optimizing in a predefined and finite search space.

acquire(unlabeled_data, batch_size=1)¶

Return indices of datapoints to acquire, from a predefined, finite search space.

Parameters:

unlabeled_data (array) – Data matrix representing an unlabeled search space.
batch_size (int) – Number of training points to acquire (move from test to training) in every iteration.

Returns:

to_acquire (list) – Row indices of unlabeled data to acquire.
score – User defined output from predict.

ensemble_test(size, initial_subset=None, batch_size=1, n_max=None, seed_list=None, nprocs=None)¶

Return a 3D array of test results for a surrogate model. The additional dimension indexes the individual tests in the ensemble.

Parameters:

size (int) – How many tests to run.
initial_subset (list) – Row indices of data to train on in the first iteration.
batch_size (int) – Number of training points to acquire (move from test to training) in every iteration.
n_max (int) – Maximum number of training points to test.
seed_list (list) – List of integer seeds for shuffling training data.
nprocs (int) – Number of processors for parallelization.

Returns:	ensemble – Array of test results with shape (size, iterations, number of metrics).
Return type:	array
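The layout of the returned array can be pictured with the following sketch, which stacks several seeded test runs into a nested list of size by iterations by number of metrics. Both function names, the two placeholder metrics, and the default seeds are hypothetical; the sketch only illustrates the shape and does not train any model.

```python
import random


def single_test(seed, n_iterations=3):
    """Stand-in for one test run: a list of [iteration, score] rows."""
    rng = random.Random(seed)
    return [[step, rng.random()] for step in range(n_iterations)]


def ensemble_sketch(size, seed_list=None):
    """Stack `size` test runs into a size x iterations x metrics nested list."""
    if seed_list is None:
        seed_list = list(range(size))  # assumed default: one seed per test
    return [single_test(seed) for seed in seed_list[:size]]


ensemble = ensemble_sketch(size=4)
shape = (len(ensemble), len(ensemble[0]), len(ensemble[0][0]))
```

Here shape is (4, 3, 2): four tests, three iterations per test, and two values recorded per iteration. Reusing a seed reproduces the same test run.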

test_acquisition(initial_subset=None, batch_size=1, n_max=None, seed=None)¶

Return an array of test results for a surrogate model.

Parameters:

initial_subset (list) – Row indices of data to train on in the first iteration.
batch_size (int) – Number of training points to acquire (move from test to training) in every iteration.
n_max (int) – Maximum number of training points to test.
seed (int) – Integer seed for shuffling the training data.
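The roles of initial_subset, batch_size and n_max can be sketched with a toy loop over a finite, fully labeled pool. Everything here is an assumption for illustration: the function name acquisition_test_sketch, the use of a seeded shuffle in place of a real surrogate model and acquisition function, and the two values recorded per iteration.

```python
import random


def acquisition_test_sketch(targets, initial_subset, batch_size=1, n_max=None,
                            seed=None):
    """Toy active learning loop over a finite, fully labeled pool.

    Returns one row per iteration: [number of training points, best target].
    """
    rng = random.Random(seed)
    train = list(initial_subset)
    pool = [i for i in range(len(targets)) if i not in train]
    rng.shuffle(pool)  # stand-in for ranking the pool with a surrogate model
    if n_max is None:
        n_max = len(targets)
    results = []
    while len(train) < n_max and pool:
        # Never acquire more points than n_max allows.
        take = min(batch_size, n_max - len(train))
        batch, pool = pool[:take], pool[take:]
        train.extend(batch)  # move points from test to training
        results.append([len(train), max(targets[i] for i in train)])
    return results


targets = [0.1, 0.7, 0.3, 0.9, 0.5, 0.2]
history = acquisition_test_sketch(targets, initial_subset=[0, 1], batch_size=2,
                                  n_max=6, seed=0)
```

Starting from two training points and acquiring two per iteration, the loop runs for two iterations before reaching n_max, and the last row reports the best target in the full pool.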
