biofefi.services package¶

Submodules¶

biofefi.services.configuration module¶

biofefi.services.configuration.load_data_preprocessing_options(path: Path) → PreprocessingOptions¶

Load data preprocessing options from the given path. The path should point to a JSON file containing the options.

Parameters:: path (Path) – The path to the JSON file containing the options.
Returns:: The data preprocessing options.
Return type:: PreprocessingOptions

biofefi.services.configuration.load_execution_options(path: Path) → ExecutionOptions¶

Load experiment execution options from the given path. The path should point to a JSON file containing the options.

Parameters:: path (Path) – The path to the JSON file containing the options.
Returns:: The execution options.
Return type:: ExecutionOptions

biofefi.services.configuration.load_fi_options(path: Path) → FeatureImportanceOptions | None¶

Load feature importance options.

Parameters:: path (Path) – The path to the feature importance options file.
Returns:: The feature importance options.
Return type:: FeatureImportanceOptions | None

biofefi.services.configuration.load_plot_options(path: Path) → PlottingOptions¶

Load plotting options from the given path. The path should point to a JSON file containing the plot options.

Parameters:: path (Path) – The path to the JSON file containing the options.
Returns:: The plotting options.
Return type:: PlottingOptions

biofefi.services.configuration.save_options(path: Path, options: T)¶

Save options to a JSON file at the specified path.

Parameters:

path (Path) – The path to the JSON file.
options (T) – The options to save.
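
The load and save functions in this module share one pattern: an options object is written to, and read back from, a JSON file. The following is a minimal sketch of that round trip, assuming the options are plain dataclasses. `SketchOptions`, `save_options_sketch` and `load_options_sketch` are hypothetical stand-ins for illustration, not BioFEFI's own classes or functions.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class SketchOptions:
    """Hypothetical options class; the real option classes have other fields."""

    dpi: int = 150
    colour_scheme: str = "default"


def save_options_sketch(path: Path, options: SketchOptions) -> None:
    """Serialise the dataclass fields to a JSON file at `path`."""
    path.write_text(json.dumps(asdict(options), indent=4))


def load_options_sketch(path: Path) -> SketchOptions:
    """Read the JSON file at `path` and rebuild the options object."""
    return SketchOptions(**json.loads(path.read_text()))


with tempfile.TemporaryDirectory() as tmp:
    options_file = Path(tmp) / "options.json"
    save_options_sketch(options_file, SketchOptions(dpi=300))
    restored = load_options_sketch(options_file)
```

Fields that are not set explicitly keep their defaults through the round trip, because `asdict` writes every field to the file.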

biofefi.services.experiments module¶

biofefi.services.experiments.create_experiment(save_dir: Path, plotting_options: PlottingOptions, execution_options: ExecutionOptions)¶

Create an experiment on disk with its global plotting options saved as a JSON file.

Parameters:

save_dir (Path) – The path to where the experiment will be created.
plotting_options (PlottingOptions) – The plotting options to save.
execution_options (ExecutionOptions) – The execution options for the experiment.

biofefi.services.experiments.delete_previous_fi_results(experiment_path: Path)¶

Delete previous feature importance results.

Parameters:: experiment_path (Path) – The path to the experiment.

biofefi.services.experiments.find_previous_fi_results(experiment_path: Path) → bool¶

Find previous feature importance results.

Parameters:: experiment_path (Path) – The path to the experiment.
Returns:: Whether previous feature importance results exist.
Return type:: bool

biofefi.services.experiments.get_experiments(base_dir: Path | None = None) → list[str]¶

Get the list of experiments in the BioFEFI experiment directory.

If base_dir is not specified, the default from biofefi_experiments_base_dir is used.

Parameters:

base_dir (Path | None, optional) – Specify a base directory for experiments. Defaults to None.

Returns:

The list of experiments.

Return type:

list[str]
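
As an illustration of the behaviour described above, here is a minimal sketch that lists experiment names, assuming each experiment is stored as a sub-directory of the base directory. That layout and the name `list_experiments_sketch` are assumptions made for this example, not confirmed details of BioFEFI.

```python
import tempfile
from pathlib import Path


def list_experiments_sketch(base_dir: Path) -> list[str]:
    """Return the sorted names of the sub-directories of `base_dir`."""
    if not base_dir.exists():
        # Assumption: a missing base directory simply means no experiments.
        return []
    return sorted(p.name for p in base_dir.iterdir() if p.is_dir())


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "experiment_a").mkdir()
    (base / "experiment_b").mkdir()
    # Plain files are ignored; only directories count as experiments here.
    (base / "notes.txt").write_text("not an experiment")
    experiments = list_experiments_sketch(base)
```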

biofefi.services.logs module¶

biofefi.services.logs.get_logs(log_dir: Path) → str¶

Get the text of the latest log file from the latest run, for display.

Parameters:: log_dir (Path) – The directory to search for the latest logs.
Raises:: NotADirectoryError – log_dir does not point to a directory.
Returns:: The text of the latest log file.
Return type:: str

biofefi.services.metrics module¶

biofefi.services.metrics.get_metrics(problem_type: ProblemTypes, logger: object = None) → dict¶

Get the metrics functions for a given problem type.

For classification:

- Accuracy
- F1
- Precision
- Recall
- ROC AUC

For regression:

- R2
- MAE
- RMSE

Parameters:

problem_type (ProblemTypes) – Whether the problem is classification or regression.
logger (object, optional) – The logger. Defaults to None.

Raises:

ValueError – If an unsupported problem type is given.

Returns:

A dict of score names and functions.

Return type:

dict
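
To make the "dict of score names and functions" concrete, the sketch below maps a problem type to metric callables written in plain Python. It is an illustration under assumptions: `ProblemTypeSketch` stands in for ProblemTypes, the dictionary keys are invented, only accuracy is shown for classification, and the real function may well return library implementations instead.

```python
import math
from enum import Enum
from typing import Callable


class ProblemTypeSketch(Enum):
    """Hypothetical stand-in for ProblemTypes."""

    CLASSIFICATION = "classification"
    REGRESSION = "regression"


def accuracy(y_true, y_pred) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def get_metrics_sketch(problem_type) -> dict[str, Callable]:
    """Return a name-to-function mapping for the given problem type."""
    if problem_type is ProblemTypeSketch.CLASSIFICATION:
        return {"accuracy": accuracy}
    if problem_type is ProblemTypeSketch.REGRESSION:
        return {"R2": r2, "MAE": mae, "RMSE": rmse}
    raise ValueError(f"Problem type {problem_type!r} is not supported")


regression_metrics = get_metrics_sketch(ProblemTypeSketch.REGRESSION)
scores = {
    name: fn([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
    for name, fn in regression_metrics.items()
}
```

Returning callables rather than computed scores lets the caller apply the same metric set to both the training and test predictions.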

biofefi.services.ml_models module¶

biofefi.services.ml_models.get_model(model_type: type, model_params: dict | None = None) → MlModel¶

Produce a machine learning model of the given type, constructed with the provided parameters.

If the model is to be used in a grid search, specify model_params=None.

Parameters:

model_type (type) – The Python type (constructor) of the model to instantiate.
model_params (dict, optional) – The parameters to pass to the model constructor. Defaults to None.

Returns:

A new instance of the requested machine learning model.

Return type:

MlModel

biofefi.services.ml_models.get_model_type(model_type: str, problem_type: ProblemTypes) → type¶

Fetch the appropriate type for a given model name based on the problem type.

Parameters:

model_type (str) – The name of the kind of model.
problem_type (ProblemTypes) – Type of problem (classification or regression).

Raises:

ValueError – If a model type is not recognised or unsupported.

Returns:

The constructor for a machine learning model class.

Return type:

type
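
The two functions above split model creation into a lookup step (name and problem type to class) and an instantiation step (class and parameters to instance). A minimal sketch of that split follows. The registry, the toy model classes and every name ending in `_sketch` or starting with `Sketch` are hypothetical, and how the real functions resolve names is not documented here.

```python
from typing import Optional


class SketchClassifier:
    """Hypothetical model class used only for this example."""

    def __init__(self, alpha: float = 1.0):
        self.alpha = alpha


class SketchRegressor:
    """Hypothetical model class used only for this example."""

    def __init__(self, alpha: float = 1.0):
        self.alpha = alpha


# Hypothetical registry keyed by (model name, problem type).
MODEL_REGISTRY = {
    ("linear model", "classification"): SketchClassifier,
    ("linear model", "regression"): SketchRegressor,
}


def get_model_type_sketch(model_name: str, problem_type: str) -> type:
    """Look up the model class for a model name and problem type."""
    try:
        return MODEL_REGISTRY[(model_name.lower(), problem_type.lower())]
    except KeyError:
        raise ValueError(
            f"Model type {model_name!r} is not recognised for {problem_type!r}"
        ) from None


def get_model_sketch(model_type: type, model_params: Optional[dict] = None):
    """Instantiate the class, using its defaults when no parameters are given."""
    if model_params is None:
        return model_type()
    return model_type(**model_params)


model_cls = get_model_type_sketch("linear model", "regression")
tuned_model = get_model_sketch(model_cls, {"alpha": 0.5})
default_model = get_model_sketch(model_cls)  # e.g. before a grid search
```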

biofefi.services.ml_models.load_models(path: Path) → dict[str, list]¶

Load pre-trained machine learning models.

Parameters:: path (Path) – The path to the directory where the models are saved.
Returns:: The pre-trained models.
Return type:: dict[str, list]

biofefi.services.ml_models.load_models_to_explain(path: Path, model_names: list) → dict[str, list]¶

Load the pre-trained machine learning models to explain.

Parameters:

path (Path) – The path to the directory where the models are saved.
model_names (list) – The names of the models to explain.

Returns:

The pre-trained models.

Return type:

dict[str, list]

biofefi.services.ml_models.models_exist(path: Path) → bool¶

biofefi.services.ml_models.save_model(model, path: Path)¶

Save a machine learning model to the given file path.

Parameters:

model – The model to save. Must be picklable.
path (Path) – The file path to save the model.
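
Because the model must be picklable, saving can be pictured as a plain `pickle` dump. The sketch below assumes exactly that, plus the creation of missing parent directories. `save_model_sketch` is a hypothetical name, and the dictionary is only a stand-in for a trained model.

```python
import pickle
import tempfile
from pathlib import Path


def save_model_sketch(model, path: Path) -> None:
    """Pickle `model` to `path`, creating parent directories if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(model, f)


with tempfile.TemporaryDirectory() as tmp:
    model_file = Path(tmp) / "models" / "example.pkl"
    # Any picklable object works; a dict stands in for a trained model.
    toy_model = {"coefficients": [0.1, 0.2], "intercept": 1.5}
    save_model_sketch(toy_model, model_file)
    with open(model_file, "rb") as f:
        reloaded = pickle.load(f)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.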

biofefi.services.ml_models.save_models_metrics(metrics: dict, path: Path)¶

Save the statistical metrics of the models to the given file path.

Parameters:

metrics (dict) – The metrics to save.
path (Path) – The file path to save the metrics.

biofefi.services.plotting module¶

biofefi.services.plotting.plot_auc_roc(y_classes_labels: ndarray, y_score_probs: ndarray, set_name: str, model_name: str, directory: Path, plot_opts: PlottingOptions | None = None)¶

Plot the ROC curve for a multi-class classification model.

Parameters:

y_classes_labels (numpy.ndarray) – The true labels of the classes.
y_score_probs (numpy.ndarray) – The predicted probabilities of the classes.
set_name (str) – The name of the set (train or test).
model_name (str) – The name of the model.
directory (Path) – The directory path to save the plot.
plot_opts (PlottingOptions | None, optional) – Options for styling the plot. Defaults to None.

Returns:

None

biofefi.services.plotting.plot_confusion_matrix(estimator, X, y, set_name: str, model_name: str, directory: Path, plot_opts: PlottingOptions | None = None)¶

Plot the confusion matrix for a multi-class or binary classification model.

Parameters:

estimator – The trained model.
X – The features.
y – The true labels.
set_name – The name of the set (train or test).
model_name – The name of the model.
directory – The directory path to save the plot.
plot_opts – Options for styling the plot. Defaults to None.

Returns:

None

biofefi.services.plotting.plot_global_shap_importance(shap_values: DataFrame, plot_opts: PlottingOptions, num_features_to_plot: int, title: str) → Figure¶

Produce a bar chart of global SHAP values.

Parameters:

shap_values (pd.DataFrame) – The DataFrame containing the global SHAP values.
plot_opts (PlottingOptions) – The plotting options.
num_features_to_plot (int) – The number of top features to plot.
title (str) – The plot title.

Returns:

The bar chart of global SHAP values.

Return type:

Figure

biofefi.services.plotting.plot_lime_importance(df: DataFrame, plot_opts: PlottingOptions, num_features_to_plot: int, title: str) → Figure¶

Plot LIME importance.

Parameters:

df (pd.DataFrame) – The LIME data to plot
plot_opts (PlottingOptions) – The plotting options.
num_features_to_plot (int) – The top number of features to plot.
title (str) – The title of the plot.

Returns:

The LIME plot.

Return type:

Figure

biofefi.services.plotting.plot_local_shap_importance(shap_values: Explainer, plot_opts: PlottingOptions, num_features_to_plot: int, title: str) → Figure¶

Plot a beeswarm plot of the local SHAP values.

Parameters:

shap_values (shap.Explainer) – The SHAP explainer to produce the plot from.
plot_opts (PlottingOptions) – The plotting options.
num_features_to_plot (int) – The number of top features to plot.
title (str) – The plot title.

Returns:

The beeswarm plot of local SHAP values.

Return type:

Figure

biofefi.services.plotting.plot_scatter(y, yp, r2: float, set_name: str, dependent_variable: str, model_name: str, plot_opts: PlottingOptions | None = None)¶

Produce a scatter plot of the true and predicted y values.

Parameters:

y – True y values.
yp – Predicted y values.
r2 (float) – R-squared between y and yp.
set_name (str) – “Train” or “Test”.
dependent_variable (str) – The name of the dependent variable.
model_name (str) – Name of the model.
plot_opts (PlottingOptions | None, optional) – Options for styling the plot. Defaults to None.

biofefi.services.preprocessing module¶

biofefi.services.preprocessing.find_non_numeric_columns(data: DataFrame | Series) → List[str]¶

Find non-numeric columns in a DataFrame or check if a Series contains non-numeric values.

Parameters:

data (Union[pd.DataFrame, pd.Series]) – The DataFrame or Series to check.

Returns:

If data is a DataFrame, a list of the non-numeric column names. If data is a Series, [“Series”] if it contains non-numeric values, otherwise an empty list.

Return type:

List[str]

biofefi.services.preprocessing.normalise_independent_variables(normalisation_method: str, X)¶

Normalise the independent variables based on the selected method.

Parameters:

normalisation_method (str) – The normalisation method to use.
X (pd.DataFrame) – The independent variables to normalise.

Returns:

The normalised independent variables.

Return type:

pd.DataFrame

biofefi.services.preprocessing.run_feature_selection(preprocessing_opts: PreprocessingOptions, data: DataFrame) → DataFrame¶

Run feature selection on the data based on the selected methods.

Parameters:

preprocessing_opts (PreprocessingOptions) – The preprocessing options, including the feature selection methods to use.
data (pd.DataFrame) – The data to perform feature selection on.

Returns:

The processed data.

Return type:

pd.DataFrame

biofefi.services.preprocessing.run_preprocessing(data: DataFrame, experiment_path: Path, config: PreprocessingOptions) → DataFrame¶

biofefi.services.preprocessing.transform_dependent_variable(transformation_y_method: str, y)¶

Transform the dependent variable based on the selected method.

Parameters:

transformation_y_method (str) – The transformation method to use.
y (pd.Series) – The dependent variable to transform.

Returns:

The transformed dependent variable.

Return type:

pd.Series

biofefi.services.weights_init module¶

biofefi.services.weights_init.kaiming_init(m: Module, nonlinearity: str = 'relu') → None¶

Initializes the weights of Linear layers using Kaiming initialization.

Parameters:

m (torch.nn.Module) – The module to initialize.
nonlinearity (str) – The nonlinearity used in the network (e.g. 'relu', 'leaky_relu'). Defaults to "relu".

Returns:

None

biofefi.services.weights_init.normal_init(m: Module, mean: float = 0.0, std: float = 0.02) → None¶

Initializes the weights of Linear layers using a normal distribution.

Parameters:

m (torch.nn.Module) – The module to initialize.
mean (float) – The mean of the normal distribution. Defaults to 0.0.
std (float) – The standard deviation of the normal distribution. Defaults to 0.02.

Returns:

None

biofefi.services.weights_init.xavier_init(m: Module) → None¶

Initializes the weights of Linear layers using Xavier initialization.

Parameters:: m (torch.nn.Module) – The module to initialize.
Returns:: None
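
For orientation, the scale of the weights that Kaiming and Xavier initialisation produce depends only on the layer's fan-in and fan-out. The sketch below computes the standard formulas in plain Python: the Kaiming normal standard deviation for ReLU and the Xavier uniform bound. Whether the functions above use the normal or the uniform variant is not stated in this reference, so the choice here is an assumption, and both helper names are hypothetical.

```python
import math


def kaiming_normal_std(fan_in: int) -> float:
    """Std of Kaiming (He) normal initialisation for ReLU: sqrt(2 / fan_in)."""
    return math.sqrt(2.0 / fan_in)


def xavier_uniform_bound(fan_in: int, fan_out: int) -> float:
    """Bound a of Xavier (Glorot) uniform U(-a, a): sqrt(6 / (fan_in + fan_out))."""
    return math.sqrt(6.0 / (fan_in + fan_out))


# Example: a Linear layer with 128 inputs and 64 outputs.
kaiming_std = kaiming_normal_std(fan_in=128)
xavier_bound = xavier_uniform_bound(fan_in=128, fan_out=64)
```

Functions with the signature shown above, taking a single module, are commonly applied to every sub-module of a network with `torch.nn.Module.apply`.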
