callbacks

Keras callback classes used by dcpg_train.py.

class deepcpg.callbacks.PerformanceLogger(metrics=['loss', 'acc'], log_freq=0.1, precision=4, callbacks=[], verbose=bool, logger=print)[source]

Logs performance metrics during training.

Stores and prints performance metrics for each batch, epoch, and output.

Parameters:

metrics: list

Names of the metrics to be logged.

log_freq: float

Logging frequency, expressed as a fraction of the training samples per epoch; for example, 0.1 logs roughly every 10% of samples.

precision: int

Floating point precision.

callbacks: list

List of functions with parameters epoch, epoch_logs, and val_epoch_logs that are called at the end of each epoch.

verbose: bool

If True, log performance metrics of individual outputs.

logger: function

Logging function.
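
A minimal usage sketch; the Keras model and the training arrays x and y are hypothetical and assumed to be defined and compiled elsewhere:

    from deepcpg.callbacks import PerformanceLogger

    # Log loss and accuracy roughly every 10% of the training samples
    # in each epoch (hypothetical `model`, `x`, and `y`).
    perf_logger = PerformanceLogger(metrics=['loss', 'acc'], log_freq=0.1)
    model.fit(x, y, callbacks=[perf_logger])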

class deepcpg.callbacks.TrainingStopper(max_time=None, stop_file=None, verbose=1, logger=print)[source]

Stop training after a given amount of time or when a stop file is detected.

Parameters:

max_time: int

Maximum training time in seconds.

stop_file: str

Name of the stop file; training ends as soon as this file exists.

verbose: bool

If True, log message when training is stopped.
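
A minimal sketch, again with a hypothetical compiled Keras model and training data; training stops after eight hours or as soon as the file ./STOP appears:

    from deepcpg.callbacks import TrainingStopper

    # Stop after 8 hours (max_time is in seconds) or when ./STOP exists.
    stopper = TrainingStopper(max_time=8 * 3600, stop_file='./STOP')
    model.fit(x, y, callbacks=[stopper])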

evaluation

Functions for evaluating prediction performance.

deepcpg.evaluation.acc(y, z, round=True)[source]

Compute accuracy.

deepcpg.evaluation.auc(y, z, round=True)[source]

Compute area under the ROC curve.

deepcpg.evaluation.cat_acc(y, z)[source]

Compute categorical accuracy given one-hot matrices.

deepcpg.evaluation.cor(y, z)[source]

Compute Pearson’s correlation coefficient.

deepcpg.evaluation.evaluate(y, z, mask=-1, metrics=[auc, acc, tpr, tnr, f1, mcc])[source]

Compute multiple performance metrics.

Computes evaluation metrics using functions in metrics.

Parameters:

y: numpy.ndarray

numpy.ndarray vector with labels.

z: numpy.ndarray

numpy.ndarray vector with predictions.

mask: scalar

Value to mask unobserved labels in y.

metrics: list

List of evaluation functions to be used.

Returns:

Ordered dict

Ordered dict with name of evaluation functions as keys and evaluation metrics as values.
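
A small sketch with hypothetical labels and predictions; the label -1 marks an unobserved value that is excluded via mask:

    import numpy as np
    from deepcpg.evaluation import evaluate

    y = np.array([1, 0, 1, -1, 0])            # labels; -1 is unobserved
    z = np.array([0.9, 0.2, 0.6, 0.5, 0.1])   # predicted probabilities

    perf = evaluate(y, z)                      # OrderedDict of metric -> value
    for name, value in perf.items():
        print(name, value)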

deepcpg.evaluation.evaluate_cat(y, z, metrics=[cat_acc], binary_metrics=None)[source]

Compute multiple performance metrics for categorical outputs.

Computes evaluation metrics for categorical (one-hot encoded) labels using the functions in metrics.

Parameters:

y: numpy.ndarray

numpy.ndarray matrix with one-hot encoded labels.

z: numpy.ndarray

numpy.ndarray matrix with class probabilities in rows.

metrics: list

List of evaluation functions to be used.

binary_metrics: list

List of binary evaluation metrics to be computed for each category (class) separately. Results are stored as name_i in the output dictionary, where name is the name of the evaluation metric and i is the index of the category.

Returns:

Ordered dict

Ordered dict with name of evaluation functions as keys and evaluation metrics as values.
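
A sketch with a hypothetical three-class example, using one-hot labels and per-row class probabilities:

    import numpy as np
    from deepcpg.evaluation import evaluate_cat

    y = np.array([[1, 0, 0],
                  [0, 1, 0],
                  [0, 0, 1]])                  # one-hot labels
    z = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.3, 0.5]])            # class probabilities per row

    print(evaluate_cat(y, z))                  # e.g. OrderedDict([('cat_acc', 1.0)])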

deepcpg.evaluation.evaluate_curve(outputs, preds, fun=roc_curve, mask=-1, nb_point=None)[source]

Evaluate performance curves of multiple outputs.

Given the labels and predictions of multiple outputs, computes a performance curve, e.g. a ROC or PR curve, for each output.

Parameters:

outputs: dict

dict with the name of outputs as keys and a numpy.ndarray vector with labels as value.

preds: dict

dict with the name of outputs as keys and a numpy.ndarray vector with predictions as value.

fun: function

Function to compute the performance curves.

mask: scalar

Value to mask unobserved labels in outputs.

nb_point: int

Maximum number of points per curve, to reduce memory usage.

Returns:

pandas.DataFrame

pandas.DataFrame with columns output, x, y, thr.
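
A sketch with two hypothetical output names; by default a ROC curve is computed for each output:

    import numpy as np
    from deepcpg.evaluation import evaluate_curve

    outputs = {'cpg/cell1': np.array([1, 0, 1, 0]),
               'cpg/cell2': np.array([0, 1, -1, 1])}    # -1 is masked
    preds = {'cpg/cell1': np.array([0.8, 0.3, 0.6, 0.2]),
             'cpg/cell2': np.array([0.4, 0.7, 0.5, 0.9])}

    curves = evaluate_curve(outputs, preds)   # DataFrame: output, x, y, thr
    print(curves.head())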

deepcpg.evaluation.evaluate_outputs(outputs, preds)[source]

Evaluate performance metrics of multiple outputs.

Given the labels and predictions of multiple outputs, chooses and computes performance metrics of each output depending on its name.

Parameters:

outputs: dict

dict with the name of outputs as keys and a numpy.ndarray vector with labels as value.

preds: dict

dict with the name of outputs as keys and a numpy.ndarray vector with predictions as value.

Returns:

pandas.DataFrame

pandas.DataFrame with columns metric, output, value.
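
A sketch with a single hypothetical CpG output; the 'cpg/' name prefix is assumed here to select the binary classification metrics:

    import numpy as np
    from deepcpg.evaluation import evaluate_outputs

    outputs = {'cpg/cell1': np.array([1, 0, 1, 0, 1])}
    preds = {'cpg/cell1': np.array([0.9, 0.1, 0.7, 0.4, 0.6])}

    report = evaluate_outputs(outputs, preds)   # columns: metric, output, value
    print(report)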

deepcpg.evaluation.f1(y, z, round=True)[source]

Compute F1 score.

deepcpg.evaluation.get(name)[source]

Return object from module by its name.

deepcpg.evaluation.get_output_metrics(output_name)[source]

Return list of evaluation metrics for model output name.

deepcpg.evaluation.is_binary_output(output_name)[source]

Return True if output_name is binary.

deepcpg.evaluation.kendall(y, z, nb_sample=100000)[source]

Compute Kendall’s correlation coefficient.

deepcpg.evaluation.mad(y, z)[source]

Compute mean absolute deviation.

deepcpg.evaluation.mcc(y, z, round=True)[source]

Compute Matthews correlation coefficient.

deepcpg.evaluation.mse(y, z)[source]

Compute mean squared error.

deepcpg.evaluation.rmse(y, z)[source]

Compute root mean squared error.

deepcpg.evaluation.tnr(y, z, round=True)[source]

Compute true negative rate.

deepcpg.evaluation.tpr(y, z, round=True)[source]

Compute true positive rate.

deepcpg.evaluation.unstack_report(report)[source]

Unstack performance report.

Reshapes a pandas.DataFrame returned by evaluate_outputs() such that performance metrics are listed as columns.

Parameters:

report: pandas.DataFrame

Returns:

pandas.DataFrame

pandas.DataFrame with performance metrics as columns.
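
A sketch using a hypothetical long-format report with the columns produced by evaluate_outputs():

    import pandas as pd
    from deepcpg.evaluation import unstack_report

    # Hypothetical long-format report (metric, output, value).
    report = pd.DataFrame({'metric': ['auc', 'acc', 'auc', 'acc'],
                           'output': ['cpg/cell1', 'cpg/cell1',
                                      'cpg/cell2', 'cpg/cell2'],
                           'value': [0.91, 0.86, 0.88, 0.84]})

    print(unstack_report(report))   # one row per output, metrics as columns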

motifs

Motif analysis.

deepcpg.motifs.get_report(filter_stats_file, tomtom_file, meme_motifs)[source]

Read and join filter_stats_file and tomtom_file.

Used by dcpg_filter_motifs.py to read and join output files.

Returns:

pandas.DataFrame

pandas.DataFrame with columns from the Tomtom output and the filter statistics file.

deepcpg.motifs.read_meme_db(meme_db_file)[source]

Read MEME database as Pandas DataFrame.

Parameters:

meme_db_file: str

File name of MEME database.

Returns:

pandas.DataFrame

pandas.DataFrame with columns ‘id’, ‘protein’, ‘url’.
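
A minimal sketch; the database path is hypothetical:

    from deepcpg.motifs import read_meme_db

    # Path to a MEME motif database file (hypothetical).
    motifs = read_meme_db('./motif_databases/CIS-BP/Homo_sapiens.meme')
    print(motifs[['id', 'protein', 'url']].head())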

deepcpg.motifs.read_tomtom(path)[source]

Read Tomtom output file.

utils

General-purpose functions.

class deepcpg.utils.ProgressBar(nb_tot, logger=print, interval=0.1)[source]

Vertical progress bar.

Unlike the progressbar2 package, it logs progress as multiple lines instead of updating a single line, which enables printing to a file. Used, for example, in dcpg_eval.py.

Parameters:

nb_tot: int

Maximum value of the progress counter.

logger: function

Function that takes a str and prints it.

interval: float

Logging frequency as fraction of one. For example, 0.1 logs every tenth value.

See also

dcpg_eval.py
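
A sketch of the intended usage; the update() and close() methods assumed here are not documented in this excerpt and follow typical progress-bar APIs:

    from deepcpg.utils import ProgressBar

    nb_sample = 1000
    pbar = ProgressBar(nb_sample, interval=0.1)   # log every 10%
    for batch_size in [250] * 4:
        # ... process one batch of `batch_size` samples ...
        pbar.update(batch_size)                   # assumed method name
    pbar.close()                                  # assumed method name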

deepcpg.utils.filter_regex(values, regexs)[source]

Filter a list of values by a list of regular expressions.

Returns:

list

Sorted list of values in values that match any regex in regexs.
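
A sketch with hypothetical output names:

    from deepcpg.utils import filter_regex

    names = ['cpg/cell1', 'cpg/cell2', 'dna', 'stats/mean']
    print(filter_regex(names, ['cpg/.*']))   # ['cpg/cell1', 'cpg/cell2']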

deepcpg.utils.fold_dict(data, nb_level=100000)[source]

Fold dict data.

Turns dictionary keys, e.g. ‘level1/level2/level3’, into sub-dicts, e.g. data[‘level1’][‘level2’][‘level3’].

Parameters:

data: dict

dict to be folded.

nb_level: int

Maximum recursion depth.

Returns:

dict

Folded dict.
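
A sketch with a hypothetical flat dict whose keys use '/' as the level separator:

    from deepcpg.utils import fold_dict

    flat = {'cpg/cell1/cov': 5,
            'cpg/cell1/dist': 120,
            'dna': 'ACGT'}
    nested = fold_dict(flat)
    print(nested['cpg']['cell1']['cov'])   # 5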

deepcpg.utils.format_table(table, colwidth=None, precision=2, header=True, sep=' | ')[source]

Format a table of values as a string.

Formats a table represented as a dict whose keys are column headers and whose values are lists of column values.

Parameters:

table: dict or OrderedDict

dict or OrderedDict with keys as column headers and values as lists of values in each column.

precision: int or list of ints

Precision of floating point values in each column. If int, uses same precision for all columns, otherwise formats columns with different precisions.

header: bool

If True, print column names.

sep: str

Column separator.

Returns:

str

String of formatted table values.
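
A sketch with a hypothetical table of metric values; an OrderedDict preserves the column order:

    from collections import OrderedDict
    from deepcpg.utils import format_table

    table = OrderedDict()
    table['loss'] = [0.412, 0.383, 0.359]
    table['acc'] = [0.812, 0.845, 0.861]
    print(format_table(table, precision=3))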

deepcpg.utils.format_table_row(values, widths=None, sep=' | ')[source]

Format a row with values of a table.

deepcpg.utils.get_from_module(identifier, module_params, ignore_case=True)[source]

Return object from module.

Return the object named identifier from a module whose items are given by module_params.

Parameters:

identifier: str

Name of object, e.g. a function, in module.

module_params: dict

dict of items in module, e.g. globals()

ignore_case: bool

If True, ignore case of identifier.

Returns:

object

Object with name identifier in module, e.g. a function or class.
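
A sketch that looks up a function in deepcpg.evaluation by name:

    from deepcpg import evaluation
    from deepcpg.utils import get_from_module

    fun = get_from_module('auc', vars(evaluation))   # look up by name
    print(fun)                                       # the `auc` function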

deepcpg.utils.linear_weights(length, start=0.1)[source]

Create linear-triangle weights.

Create an array x of length length with linear weights, where the weight is highest (one) at the center x[length//2] and lowest (start) at the ends x[0] and x[-1].

Parameters:

length: int

Length of the weight array.

start: float

Minimum weight at the ends.

Returns:

np.ndarray

Array of length length with weights.
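
A short sketch; the printed values are illustrative:

    from deepcpg.utils import linear_weights

    w = linear_weights(5, start=0.1)
    print(w)   # e.g. [0.1, 0.55, 1.0, 0.55, 0.1]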

deepcpg.utils.make_dir(dirname)[source]

Create directory dirname if it does not exist.

Parameters:

dirname: str

Path of directory to be created.

Returns:

bool

True if the directory did not exist and was created.

deepcpg.utils.move_columns_front(frame, columns)[source]

Move columns of Pandas DataFrame to the front.

deepcpg.utils.slice_dict(data, idx)[source]

Slice elements in dict data by idx.

Slices array-like objects in data by index idx. data can be tree-like with sub-dicts, where the leaves must be sliceable by idx.

Parameters:

data: dict

dict to be sliced.

idx: slice

Slice index.

Returns:

dict

dict with the same elements as data, sliced by idx.
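
A sketch with a hypothetical nested dict of NumPy arrays:

    import numpy as np
    from deepcpg.utils import slice_dict

    data = {'inputs': {'dna': np.arange(10)},
            'outputs': {'cpg/cell1': np.arange(10) * 2}}

    first = slice_dict(data, slice(0, 3))
    print(first['inputs']['dna'])   # [0 1 2]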

deepcpg.utils.to_list(value)[source]

Convert value to a list.