model
Package for building and training DeepCpG models.
model.utils
Functions for building, training, and loading models.
-
class deepcpg.models.utils.DataReader(output_names=None, use_dna=True, dna_wlen=None, replicate_names=None, cpg_wlen=None, cpg_max_dist=25000, encode_replicates=False)
Read data from dcpg_data.py output files.
Generator to read data batches from dcpg_data.py output files. Reads data using hdf.reader() and pre-processes data.
Parameters: output_names: list
Names of outputs to be read.
use_dna: bool
If True, read DNA sequence windows.
dna_wlen: int
Maximum length of DNA sequence windows.
replicate_names: list
Name of cells (profiles) whose neighboring CpG sites are read.
cpg_wlen: int
Maximum number of neighboring CpG sites.
cpg_max_dist: int
Value to threshold the distance of neighboring CpG sites.
encode_replicates: bool
If True, encode replicate names in the keys of the returned dict. This option is deprecated and will be removed in the future.
Returns: tuple
tuple (inputs, outputs, weights), where inputs, outputs, and weights are each a dict of model inputs, outputs, and output weights, respectively. outputs and weights are not returned if output_names is undefined.
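To illustrate the cpg_max_dist parameter, the following minimal sketch caps neighbor distances at the threshold. It uses plain Python lists, and threshold_distances is a hypothetical helper rather than part of DeepCpG; any further pre-processing the library applies to distances is not shown.

```python
def threshold_distances(distances, cpg_max_dist=25000):
    """Cap each distance at cpg_max_dist (hypothetical helper)."""
    return [min(d, cpg_max_dist) for d in distances]


# Distances (in base pairs) to neighboring CpG sites; the values are made up.
distances = [120, 24999, 25000, 310000]
print(threshold_distances(distances))
```

Distances at or below the threshold pass through unchanged; larger ones are replaced by the threshold itself.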
-
class deepcpg.models.utils.Model(dropout=0.0, l1_decay=0.0, l2_decay=0.0, init='glorot_uniform')
Abstract model class.
Abstract class of DNA, CpG, and Joint models.
Parameters: dropout: float
Dropout rate.
l1_decay: float
L1 weight decay.
l2_decay: float
L2 weight decay.
init: str
Name of Keras initialization.
-
class deepcpg.models.utils.ScaledSigmoid(scaling=1.0, **kwargs)
Scaled sigmoid activation function.
Scales the maximum of the sigmoid function from one to the provided value.
Parameters: scaling: float
Maximum of sigmoid function.
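The scaling described above amounts to multiplying the standard sigmoid by a constant. The sketch below shows this for a single scalar with the standard library only; scaled_sigmoid is an illustrative function, not the Keras layer itself.

```python
import math


def scaled_sigmoid(x, scaling=1.0):
    """Sigmoid whose upper bound is `scaling` instead of one."""
    # Branch on the sign of x so math.exp never receives a large positive argument.
    if x >= 0:
        sigmoid = 1.0 / (1.0 + math.exp(-x))
    else:
        e = math.exp(x)
        sigmoid = e / (1.0 + e)
    return scaling * sigmoid


print(scaled_sigmoid(0.0, scaling=2.0))  # the midpoint equals scaling / 2
```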
-
deepcpg.models.utils.add_output_layers(stem, output_names, init='glorot_uniform')
Add and return outputs to a given layer.
Adds output layer for each output in output_names to layer stem.
Parameters: stem: Keras layer
Keras layer to which output layers are added.
output_names: list
List of output names.
Returns: list
Output layers added to stem.
-
deepcpg.models.utils.copy_weights(src_model, dst_model, must_exist=True)
Copy weights from src_model to dst_model.
Parameters: src_model
Keras source model.
dst_model
Keras destination model.
must_exist: bool
If True, raises ValueError if a layer in dst_model does not exist in src_model.
Returns: list
Names of layers that were copied.
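The must_exist behavior can be sketched without Keras by letting dictionaries that map layer names to weight lists stand in for models. The function name copy_weights_by_name, the dict representation, and the error message are assumptions made for illustration only.

```python
def copy_weights_by_name(src_layers, dst_layers, must_exist=True):
    """Copy weights between name -> weights dicts that stand in for Keras models."""
    copied = []
    for name in dst_layers:
        if name not in src_layers:
            if must_exist:
                raise ValueError('Layer "%s" not found in source model!' % name)
            continue  # tolerate layers that only exist in the destination
        dst_layers[name] = list(src_layers[name])
        copied.append(name)
    return copied


src = {'conv1': [0.1, 0.2], 'fc1': [0.3]}
dst = {'conv1': [0.0, 0.0], 'out': [0.0]}
print(copy_weights_by_name(src, dst, must_exist=False))
```

With must_exist=False, the destination-only layer 'out' keeps its own weights; with the default must_exist=True, the same call would raise ValueError.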
-
deepcpg.models.utils.data_reader_from_model(model, outputs=True, replicate_names=None)
Return DataReader from model.
Builds a DataReader for reading data for model.
Parameters: model: Model
outputs: bool
If True, return output labels.
replicate_names: list
Name of input cells of model.
Returns: Instance of DataReader.
-
deepcpg.models.utils.decode_replicate_names(replicate_names)
Decode string of replicate names and return names as list.
Note: Deprecated. This function is used to support legacy models and will be removed in the future.
-
deepcpg.models.utils.encode_replicate_names(replicate_names)
Encode list of replicate names as single string.
Note: Deprecated. This function is used to support legacy models and will be removed in the future.
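Encoding and decoding are inverse operations: a list of names is joined into one string and later split back. The sketch below assumes a '--' delimiter, which is an illustrative choice and may not match the delimiter the library uses; encode_names and decode_names are hypothetical stand-ins.

```python
SEPARATOR = '--'  # assumed delimiter; the library's actual choice may differ


def encode_names(replicate_names):
    """Join a list of replicate names into a single string."""
    return SEPARATOR.join(replicate_names)


def decode_names(encoded):
    """Split an encoded string back into the list of replicate names."""
    return encoded.split(SEPARATOR)


names = ['cell1', 'cell2', 'cell3']
encoded = encode_names(names)
print(encoded)
print(decode_names(encoded) == names)
```

The round trip only works if no replicate name contains the delimiter itself.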
-
deepcpg.models.utils.evaluate_generator(model, generator, return_data=False, *args, **kwargs)
Evaluate model on generator.
Uses predict_generator to obtain predictions and ev.evaluate to evaluate predictions.
Parameters: model
Model to be evaluated.
generator
Data generator.
return_data: bool
If True, return predictions and labels in addition to the performance metrics.
*args: list
Unnamed arguments passed to predict_generator.
**kwargs: dict
Named arguments passed to predict_generator.
Returns: If return_data=False, pandas data frame with performance metrics. If return_data=True, tuple (perf, data) with performance metrics perf and data.
-
deepcpg.models.utils.get_first_conv_layer(layers, get_act=False)
Return the first convolutional layer in a stack of layers.
Parameters: layers: list
List of Keras layers.
get_act: bool
If True, also return the activation layer that follows the convolutional layer.
Returns: Keras layer
Convolutional layer or tuple of convolutional layer and activation layer if get_act=True.
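The lookup can be sketched with lightweight stand-ins for Keras layers. In this sketch, Layer is a hypothetical named tuple carrying only a name and a type tag, and raising ValueError when nothing is found is an assumption, not documented behavior.

```python
from collections import namedtuple

# Stand-in for a Keras layer: only the name and a type tag matter here.
Layer = namedtuple('Layer', ['name', 'kind'])


def first_conv_layer(layers, get_act=False):
    """Return the first 'conv' layer, optionally with the next 'activation' layer."""
    for idx, layer in enumerate(layers):
        if layer.kind != 'conv':
            continue
        if not get_act:
            return layer
        # Search only the layers that come after the convolutional layer.
        for following in layers[idx + 1:]:
            if following.kind == 'activation':
                return layer, following
        raise ValueError('No activation layer after convolutional layer!')
    raise ValueError('No convolutional layer found!')


stack = [Layer('dna', 'input'), Layer('conv1', 'conv'),
         Layer('act1', 'activation'), Layer('conv2', 'conv')]
print(first_conv_layer(stack).name)
print([layer.name for layer in first_conv_layer(stack, get_act=True)])
```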
-
deepcpg.models.utils.get_objectives(output_names)
Return training objectives for a list of output names.
Returns: dict
dict with output_names as keys and the name of the assigned Keras objective as values.
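The shape of the returned dict can be sketched as a prefix lookup. The rule table below (which prefix maps to which Keras objective) and the default are hypothetical; the assignment rules that the library applies are not described in this section.

```python
def objectives_for(output_names, rules, default='mean_squared_error'):
    """Map each output name to an objective name via the first matching prefix."""
    objectives = {}
    for name in output_names:
        for prefix, objective in rules:
            if name.startswith(prefix):
                objectives[name] = objective
                break
        else:
            # No prefix matched: fall back to the default objective.
            objectives[name] = default
    return objectives


# Hypothetical rule table, for illustration only.
rules = [('cpg/', 'binary_crossentropy')]
print(objectives_for(['cpg/cell1', 'stats/mean'], rules))
```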
-
deepcpg.models.utils.get_sample_weights(y, class_weights=None)
Compute sample weights for model training.
Computes sample weights given a vector of output labels y. Sets weights of samples without label (CPG_NAN) to zero.
Parameters: y: :class:`numpy.ndarray`
1d numpy array of output labels.
class_weights: dict
Weight of output classes, e.g. methylation states.
Returns: Sample weights with the same size as y.
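The weighting rule described above can be sketched in a few lines. This version works on plain Python lists rather than NumPy arrays, assumes that CPG_NAN is -1, and gives a default weight of 1.0 to labels missing from class_weights; all three are assumptions for illustration.

```python
CPG_NAN = -1  # assumed sentinel value for a missing label


def sample_weights(y, class_weights=None):
    """Return one weight per label; unlabeled samples get weight zero."""
    weights = []
    for label in y:
        if label == CPG_NAN:
            weights.append(0.0)
        elif class_weights is not None:
            weights.append(float(class_weights.get(label, 1.0)))
        else:
            weights.append(1.0)
    return weights


labels = [0, 1, CPG_NAN, 1]
print(sample_weights(labels))
print(sample_weights(labels, class_weights={0: 0.25, 1: 0.75}))
```

A zero weight removes the sample from the loss, so unobserved CpG sites do not contribute to training.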
-
deepcpg.models.utils.load_model(model_files, custom_objects={'ScaledSigmoid': <class 'deepcpg.models.utils.ScaledSigmoid'>}, log=None)
Load Keras model from a list of model files.
Loads a Keras model from a list of filenames, e.g. from search_model_files. model_files can be a single HDF5 file, or a JSON file and a weights file.
Parameters: model_files: list
Input model file names.
custom_objects: dict
Custom objects for loading models that were trained with custom objects, e.g. ScaledSigmoid.
Returns: Keras model.
-
deepcpg.models.utils.predict_generator(model, generator, nb_sample=None)
Predict model outputs using generator.
Calls model.predict for at most nb_sample samples from generator.
Parameters: model: Keras model
Model to be evaluated.
generator: generator
Data generator.
nb_sample: int
Maximum number of samples.
Returns: list
list [inputs, outputs, predictions].
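The nb_sample limit can be sketched as a loop that consumes batches until enough samples are collected. Here, predict is any callable standing in for model.predict, batches are plain lists rather than dicts of arrays, and truncating the last batch to hit the limit exactly is an assumption about the behavior.

```python
def predict_from_generator(predict, generator, nb_sample=None):
    """Collect [inputs, outputs, predictions] for at most nb_sample samples."""
    inputs, outputs, predictions = [], [], []
    for batch_inputs, batch_outputs in generator:
        if nb_sample is not None:
            # Truncate the batch so the total never exceeds nb_sample.
            remaining = nb_sample - len(inputs)
            batch_inputs = batch_inputs[:remaining]
            batch_outputs = batch_outputs[:remaining]
        inputs.extend(batch_inputs)
        outputs.extend(batch_outputs)
        predictions.extend(predict(batch_inputs))
        if nb_sample is not None and len(inputs) >= nb_sample:
            break
    return [inputs, outputs, predictions]


def batches(values, batch_size):
    """Toy generator yielding (inputs, labels) batches."""
    for start in range(0, len(values), batch_size):
        chunk = values[start:start + batch_size]
        yield chunk, [v % 2 for v in chunk]


def double(xs):
    """Toy stand-in for model.predict."""
    return [2 * x for x in xs]


print(predict_from_generator(double, batches(list(range(10)), 4), nb_sample=6))
```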
-
deepcpg.models.utils.save_model(model, model_file, weights_file=None)
Save Keras model to file.
If model_file ends with ‘.h5’, saves model description and model weights in HDF5 file. Otherwise, saves JSON model description in model_file and model weights in weights_file if provided.
Parameters: model
Keras model.
model_file: str
Output file.
weights_file: str
Weights file.
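The choice between the two storage layouts depends only on the file extension. The sketch below returns a description of what would be written instead of touching the file system; plan_save is a hypothetical helper and does not call Keras.

```python
def plan_save(model_file, weights_file=None):
    """Return a mapping of file name -> what would be stored in it."""
    if model_file.endswith('.h5'):
        # Single HDF5 file holds both the description and the weights.
        return {model_file: 'description and weights (HDF5)'}
    plan = {model_file: 'description (JSON)'}
    if weights_file:
        plan[weights_file] = 'weights'
    return plan


print(plan_save('model.h5'))
print(plan_save('model.json', 'model_weights.h5'))
print(plan_save('model.json'))
```

In the last call no weights file is given, so only the JSON description would be written.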
model.cpg
CpG models.
Provides models trained with observed neighboring methylation states of multiple cells.
-
class deepcpg.models.cpg.FcAvg(*args, **kwargs)
Fully-connected layer followed by global average layer.
Parameters: 54,000
Specification: fc[512]_gap
-
class deepcpg.models.cpg.RnnL1(act_replicate='relu', *args, **kwargs)
Bidirectional GRU with one layer.
Parameters: 810,000
Specification: fc[256]_bgru[256]_do
model.dna
DNA models.
Provides models trained with DNA sequence windows.
-
class deepcpg.models.dna.CnnL1h128(nb_hidden=128, *args, **kwargs)
CNN with one convolutional and one fully-connected layer with 128 units.
Parameters: 4,100,000
Specification: conv[128@11]_mp[4]_fc[128]_do
-
class deepcpg.models.dna.CnnL1h256(*args, **kwargs)
CNN with one convolutional and one fully-connected layer with 256 units.
Parameters: 8,100,000
Specification: conv[128@11]_mp[4]_fc[256]_do
-
class deepcpg.models.dna.CnnL2h128(nb_hidden=128, *args, **kwargs)
CNN with two convolutional and one fully-connected layer with 128 units.
Parameters: 4,100,000
Specification: conv[128@11]_mp[4]_conv[256@3]_mp[2]_fc[128]_do
-
class deepcpg.models.dna.CnnL2h256(*args, **kwargs)
CNN with two convolutional and one fully-connected layer with 256 units.
Parameters: 8,100,000
Specification: conv[128@11]_mp[4]_conv[256@3]_mp[2]_fc[256]_do
-
class deepcpg.models.dna.CnnL3h128(nb_hidden=128, *args, **kwargs)
CNN with three convolutional and one fully-connected layer with 128 units.
Parameters: 4,400,000
Specification: conv[128@11]_mp[4]_conv[256@3]_mp[2]_conv[512@3]_mp[2]_fc[128]_do
-
class deepcpg.models.dna.CnnL3h256(*args, **kwargs)
CNN with three convolutional and one fully-connected layer with 256 units.
Parameters: 8,300,000
Specification: conv[128@11]_mp[4]_conv[256@3]_mp[2]_conv[512@3]_mp[2]_fc[256]_do
-
class deepcpg.models.dna.CnnRnn01(*args, **kwargs)
Convolutional-recurrent model.
Convolutional-recurrent model with two convolutional layers followed by a bidirectional GRU layer.
Parameters: 1,100,000
Specification: conv[128@11]_pool[4]_conv[256@7]_pool[4]_bgru[256]_do
-
class deepcpg.models.dna.ResAtrous01(*args, **kwargs)
Residual network with Atrous (dilated) convolutional layers.
Residual network with Atrous (dilated) convolutional layers in bottleneck units. Atrous convolutional layers enlarge the receptive field and hence model long-range dependencies better.
Parameters: 2,000,000
Specification: conv[128@11]_mp[2]_resa[3x128|3x256|3x512|1x1024]_gap_do
He et al., ‘Identity Mappings in Deep Residual Networks.’
Yu and Koltun, ‘Multi-Scale Context Aggregation by Dilated Convolutions.’
-
class deepcpg.models.dna.ResConv01(*args, **kwargs)
Residual network with two convolutional layers in each residual unit.
Parameters: 2,800,000
Specification: conv[128@11]_mp[2]_resc[2x128|1x256|1x256|1x512]_gap_do
He et al., ‘Identity Mappings in Deep Residual Networks.’
-
class deepcpg.models.dna.ResNet01(*args, **kwargs)
Residual network with bottleneck residual units.
Parameters: 1,700,000
Specification: conv[128@11]_mp[2]_resb[2x128|2x256|2x512|1x1024]_gap_do
He et al., ‘Identity Mappings in Deep Residual Networks.’
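The Specification strings listed for the models share a common shape: layer tokens joined by underscores, each with an optional bracketed argument. The sketch below splits such a string into (name, argument) pairs. parse_specification is a hypothetical helper, and it deliberately does not interpret the arguments, since their meaning is not defined in this section.

```python
import re

# A token is a lowercase layer name, optionally followed by [arguments].
TOKEN = re.compile(r'^(?P<name>[a-z]+)(?:\[(?P<args>[^\]]*)\])?$')


def parse_specification(spec):
    """Split a specification string into (layer name, argument string) pairs."""
    layers = []
    for token in spec.split('_'):
        match = TOKEN.match(token.strip())
        if match is None:
            raise ValueError('Cannot parse token: %r' % token)
        layers.append((match.group('name'), match.group('args')))
    return layers


print(parse_specification('conv[128@11]_mp[4]_fc[128]_do'))
print(parse_specification('conv[128@11]_mp[2]_resb[2x128|2x256|2x512|1x1024]_gap_do'))
```

Tokens without brackets, such as do and gap, are returned with None as their argument.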