sleap.nn.architectures.pretrained_encoders#
Encoder-decoder backbones with pretrained encoder models.
This module enables encoder-decoder style architectures using a wide range of state-of-the-art architectures as the encoder, with a UNet-like upsampling stack that also takes advantage of skip connections from the encoder blocks. Additionally, the encoder models can use ImageNet-pretrained weights for initialization.
This module is made possible by the work of Pavel Yakubovskiy, who graciously put together a library implementing common pretrained fully-convolutional architectures and their weights.
- For more info, see the source repositories:
  - segmentation_models: https://github.com/qubvel/segmentation_models
  - classification_models: https://github.com/qubvel/classification_models
License:
The MIT License
Copyright (c) 2018, Pavel Yakubovskiy
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- class sleap.nn.architectures.pretrained_encoders.UnetPretrainedEncoder(encoder: str = 'efficientnetb0', decoder_filters: Tuple[int] = (256, 256, 128, 128), pretrained: bool = True)[source]#
UNet with an (optionally) pretrained encoder model.
This backbone enables the use of a variety of popular neural network architectures for feature extraction in the backbone. These can be used with ImageNet-pretrained weights for initialization. The decoder (upsampling stack) receives skip connections from intermediate activations in the encoder blocks.
All of the encoder models have a maximum stride of 32 and the input does not need to be preprocessed in any special way. Grayscale images will be tiled to have 3 channels automatically.
See qubvel/classification_models for more information on the individual backbones.
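For orientation, here is a minimal sketch of wiring this backbone into a Keras model, based only on the constructor and `make_backbone()` documented on this page; the input size and encoder choice are arbitrary, and the surrounding training setup is omitted:

```python
import tensorflow as tf

from sleap.nn.architectures.pretrained_encoders import UnetPretrainedEncoder

# Build the backbone. With 4 decoder filter blocks the output stride is
# 32 / 2**4 = 2 (half the input resolution).
backbone = UnetPretrainedEncoder(
    encoder="efficientnetb0",
    decoder_filters=(256, 256, 128, 128),
    pretrained=True,
)

# Grayscale inputs are fine: they are tiled to 3 channels automatically.
x_in = tf.keras.layers.Input(shape=(256, 256, 1))
x_main, intermediate_feats = backbone.make_backbone(x_in)
model = tf.keras.Model(inputs=x_in, outputs=x_main)
```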
- encoder#
Name of the model to use as the encoder. Valid encoder names are:
- "vgg16", "vgg19"
- "resnet18", "resnet34", "resnet50", "resnet101", "resnet152"
- "resnext50", "resnext101"
- "inceptionv3", "inceptionresnetv2"
- "densenet121", "densenet169", "densenet201"
- "seresnet18", "seresnet34", "seresnet50", "seresnet101", "seresnet152", "seresnext50", "seresnext101", "senet154"
- "mobilenet", "mobilenetv2"
- "efficientnetb0", "efficientnetb1", "efficientnetb2", "efficientnetb3", "efficientnetb4", "efficientnetb5", "efficientnetb6", "efficientnetb7"

Defaults to "efficientnetb0".
- Type:
str
- decoder_filters#
A tuple of integers denoting the number of filters to use in the upsampling blocks of the decoder, starting from the lowest resolution block. The length of this attribute also specifies the number of upsampling steps and therefore the output stride of the backbone. Specify 5 filter numbers to get an output stride of 1 (same size as the input). Defaults to (256, 256, 128, 128).
- Type:
Tuple[int]
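To make the relationship concrete: the encoder's maximum stride is fixed at 32, and each decoder block halves the stride. A quick illustration of the arithmetic (not part of the API, just the rule stated above):

```python
# Each upsampling block halves the stride, starting from the encoder's
# fixed maximum stride of 32.
def output_stride(decoder_filters):
    return 32 // (2 ** len(decoder_filters))

assert output_stride((256, 256, 128, 128)) == 2      # default: 4 blocks
assert output_stride((256, 256, 128, 128, 64)) == 1  # 5 blocks: full input resolution
```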
- pretrained#
If True (the default), load pretrained weights for the encoder. If False, the same model architecture will be used for the encoder but the weights will be randomly initialized.
- Type:
bool
- property down_blocks: int#
Return the number of downsampling blocks in the encoder.
- classmethod from_config(config: PretrainedEncoderConfig) → UnetPretrainedEncoder[source]#
Create the backbone from a configuration.
- Parameters:
config – A PretrainedEncoderConfig instance specifying the configuration of the backbone.
- Returns:
An instantiated UnetPretrainedEncoder.
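A hedged sketch of configuration-driven construction. The import path and field names of PretrainedEncoderConfig are assumptions here (they are taken to mirror the constructor arguments); consult sleap.nn.config for the authoritative schema:

```python
# Assumed import path and field names -- see sleap.nn.config for the
# real PretrainedEncoderConfig definition.
from sleap.nn.config import PretrainedEncoderConfig
from sleap.nn.architectures.pretrained_encoders import UnetPretrainedEncoder

config = PretrainedEncoderConfig(encoder="resnet50", pretrained=True)
backbone = UnetPretrainedEncoder.from_config(config)
```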
- make_backbone(x_in: Tensor) → Tuple[Tensor, List[IntermediateFeature]][source]#
Create the backbone and return the output tensors for building a model.
- Parameters:
x_in – A tf.Tensor representing the input to this backbone. This is typically an instance of tf.keras.layers.Input() but can also be any rank-4 tensor. Can be grayscale or RGB.
- Returns:
A tuple of (x_main, intermediate_activations). x_main is the output tensor from the last upsampling block. intermediate_activations is a list of `IntermediateFeature`s containing tensors with the outputs from each block of the decoder, for use in building multi-output models at different feature strides.
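Continuing the sketch after the class description above, the intermediate features can seed extra output heads at coarser strides. The `.tensor` and `.stride` attribute names below are assumptions based on the IntermediateFeature type named in the signature; check its documentation:

```python
x_main, intermediate_feats = backbone.make_backbone(x_in)

# Each intermediate feature pairs a decoder output tensor with its stride,
# so heads can be attached at multiple resolutions. Attribute names here
# (.stride, .tensor) are assumed, not confirmed by this page.
for feat in intermediate_feats:
    print(feat.stride, feat.tensor.shape)
```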
- property maximum_stride: int#
Return the maximum encoder stride relative to the input.
- property output_stride: int#
Return the stride of the output of the decoder.
- property up_blocks: int#
Return the number of upsampling blocks in the decoder.
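For the backbone constructed in the sketch above, these properties should work out as follows (an illustration assuming the default 4-block decoder; the down-block count follows from the fixed maximum stride of 32 = 2**5):

```python
print(backbone.down_blocks)     # 5: encoder downsamples by 2**5 = 32
print(backbone.up_blocks)       # 4 with the default decoder_filters
print(backbone.maximum_stride)  # 32
print(backbone.output_stride)   # 32 // 2**4 = 2
```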