sleap.nn.architectures.encoder_decoder#
Generic encoder-decoder fully convolutional backbones.
This module contains building blocks for creating encoder-decoder architectures of general form.
The encoder branch of the network forms the initial multi-scale feature extraction via repeated blocks of convolutions and pooling steps.
The decoder branch is then responsible for upsampling the low resolution feature maps to achieve the target output stride.
This pattern is generalizable and describes most fully convolutional architectures. For example:
- simple convolutions with pooling form the structure in `LEAP CNN <https://www.nature.com/articles/s41592-018-0234-5>`_;
- adding skip connections forms U-Net;
- using residual blocks with skip connections forms the base module in `stacked hourglass <https://arxiv.org/pdf/1603.06937.pdf>`_;
- using dense blocks with skip connections forms `FC-DenseNet <https://arxiv.org/pdf/1611.09326.pdf>`_.
This module implements the blocks used in all of these variants on top of generic base classes.
See the EncoderDecoder base class for the requirements for creating new architectures.
- class sleap.nn.architectures.encoder_decoder.DecoderBlock(upsampling_stride: int = 2)[source]#
Base class for decoder blocks.
- upsampling_stride#
The striding of the upsampling layer. This is typically set to 2, such that the input tensor doubles in size after the block, but can be set higher to upsample in fewer steps.
- Type
int
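The trade-off between the upsampling stride and the number of decoder steps can be sketched with plain integer arithmetic. The helper below is hypothetical (it is not part of SLEAP) and assumes every block in the decoder uses the same upsampling stride.

```python
def count_upsampling_blocks(
    input_stride: int, output_stride: int, upsampling_stride: int = 2
) -> int:
    """Count the blocks needed to go from input_stride down to output_stride.

    Hypothetical helper for illustration only; assumes all blocks share
    the same upsampling stride.
    """
    if upsampling_stride < 2:
        raise ValueError("upsampling_stride must be at least 2")
    blocks = 0
    stride = input_stride
    while stride > output_stride:
        if stride % upsampling_stride != 0:
            raise ValueError("stride is not divisible by upsampling_stride")
        # Each block divides the stride, i.e. multiplies the tensor size.
        stride //= upsampling_stride
        blocks += 1
    if stride != output_stride:
        raise ValueError("output_stride cannot be reached exactly")
    return blocks


# Going from stride 16 to stride 2 takes three blocks at stride 2.
steps_at_stride_2 = count_upsampling_blocks(16, 2, upsampling_stride=2)
# Going from stride 16 to stride 1 takes two blocks at stride 4.
steps_at_stride_4 = count_upsampling_blocks(16, 1, upsampling_stride=4)
```

A larger upsampling stride reaches the target in fewer blocks, but only target strides that are reachable by exact division are valid.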
- make_block(x: tensorflow.python.framework.ops.Tensor, current_stride: Optional[int], skip_source: Optional[tensorflow.python.framework.ops.Tensor] = None, prefix: str = 'upsample') tensorflow.python.framework.ops.Tensor [source]#
Instantiate the decoder block from an input tensor.
- Parameters
x – Input tensor to the block.
current_stride – The stride of the input tensor.
skip_source – A tensor that will be used to form a skip connection if the block is configured to use it.
prefix – String that will be added to the name of every layer in the block. If not specified, instantiating this block multiple times may result in name conflicts if existing layers have the same name.
- Returns
The output tensor after applying all operations in the block.
- class sleap.nn.architectures.encoder_decoder.EncoderBlock(pool: bool = True, pooling_stride: int = 2)[source]#
Base class for encoder blocks.
- pool#
If True, applies max pooling at the end of the block.
- Type
bool
- pooling_stride#
Stride of the max pooling operation. If 1, the output of this block will be at the same stride (== 1/scale) as the input.
- Type
int
- class sleap.nn.architectures.encoder_decoder.EncoderDecoder(stacks: int = 1)[source]#
General encoder-decoder base class.
New architectures that follow the encoder-decoder pattern can be defined by inheriting from this class and implementing the encoder_stack and decoder_stack properties.
- stacks#
If greater than 1, the encoder-decoder architecture will be repeated.
- Type
int
- property decoder_features_stride: int#
Return the relative stride of the final output of the decoder.
This is equivalent to the stride of the decoder assuming that it is constructed from an input with stride 1.
- property decoder_stack: Sequence[sleap.nn.architectures.encoder_decoder.DecoderBlock]#
Return a list of decoder blocks that define the decoder.
- property encoder_features_stride: int#
Return the relative stride of the final output of the encoder.
This is equivalent to the stride of the encoder assuming that it is constructed from an input with stride 1.
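As a rough illustration of what a relative stride means, the sketch below multiplies the pooling strides of the blocks that actually pool. `PoolingSpec` is a hypothetical stand-in that only mirrors the `pool` and `pooling_stride` attributes of `EncoderBlock`; treating non-pooling blocks as contributing a factor of 1 is an assumption, not a statement about the library's implementation.

```python
from dataclasses import dataclass
from math import prod
from typing import Sequence


@dataclass
class PoolingSpec:
    """Hypothetical stand-in mirroring EncoderBlock's pooling attributes."""

    pool: bool = True
    pooling_stride: int = 2


def relative_encoder_stride(blocks: Sequence[PoolingSpec]) -> int:
    """Stride of the encoder output assuming the input has stride 1."""
    # Blocks that do not pool are assumed to leave the stride unchanged.
    return prod(block.pooling_stride for block in blocks if block.pool)


def absolute_encoder_stride(
    blocks: Sequence[PoolingSpec], current_stride: int = 1
) -> int:
    """Stride of the encoder output relative to the original input."""
    return current_stride * relative_encoder_stride(blocks)


four_blocks = [PoolingSpec() for _ in range(4)]
relative = relative_encoder_stride(four_blocks)  # 2 * 2 * 2 * 2
absolute = absolute_encoder_stride(four_blocks, current_stride=2)
```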
- property encoder_stack: Sequence[sleap.nn.architectures.encoder_decoder.EncoderBlock]#
Return a list of encoder blocks that define the encoder.
- make_backbone(x_in: tensorflow.python.framework.ops.Tensor, current_stride: int = 1) Union[Tuple[tensorflow.python.framework.ops.Tensor, List[sleap.nn.architectures.common.IntermediateFeature]], Tuple[List[tensorflow.python.framework.ops.Tensor], List[List[sleap.nn.architectures.common.IntermediateFeature]]]] [source]#
Instantiate the entire encoder-decoder backbone.
- Parameters
x_in – The input tensor.
current_stride – The stride of x_in relative to the original input. This is 1 if the input tensor comes from the input layer of the network. If not, this must be set appropriately in order to match up intermediate tensors during decoder construction.
- Returns
A tuple of the final output tensor of the decoder and a list of `IntermediateFeature`s.
The intermediate features contain the output tensors from every block except the last. This includes the input to this function (x_in). These are useful when defining heads that take inputs at multiple scales.
If the architecture has more than 1 stack, the outputs are each lists of output tensors and intermediate features corresponding to each stack.
- make_decoder(x_in: tensorflow.python.framework.ops.Tensor, current_stride: int, skip_source_features: Optional[Sequence[sleap.nn.architectures.common.IntermediateFeature]] = None, prefix: str = 'dec') Tuple[tensorflow.python.framework.ops.Tensor, List[sleap.nn.architectures.common.IntermediateFeature]] [source]#
Instantiate the decoder layers defined by the decoder stack configuration.
- Parameters
x_in – The input tensor.
current_stride – The stride of x_in relative to the original input. This is the stride of the output of the encoder relative to the original input.
skip_source_features – A sequence of `IntermediateFeature`s containing tensors that can be used to form skip connections at matching strides. At every decoder block, the first skip source feature found at the input stride of the block will be passed to the block instantiation method. If the decoder block is not configured to form skip connections, these will be ignored even if found.
prefix – String prefix for naming decoder layers.
- Returns
A tuple of the final output tensor of the decoder and a list of `IntermediateFeature`s.
The intermediate features contain the output tensors from every block except the last. This includes the input to this function (x_in). These are useful when defining heads that take inputs at multiple scales.
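The stride-matching rule described for skip_source_features can be sketched in a few lines. `Feature` below is a hypothetical stand-in for `IntermediateFeature`; the assumption that it carries a `tensor` and a `stride` field is made for illustration, and strings are used in place of real tensors.

```python
from dataclasses import dataclass
from typing import Any, Optional, Sequence


@dataclass
class Feature:
    """Hypothetical stand-in for IntermediateFeature (tensor plus stride)."""

    tensor: Any
    stride: int


def first_skip_source(
    features: Sequence[Feature], stride: int
) -> Optional[Feature]:
    """Return the first feature at the requested stride, or None if absent."""
    for feature in features:
        if feature.stride == stride:
            return feature
    return None


encoder_features = [
    Feature("input", 1),
    Feature("enc_block0", 2),
    Feature("enc_block1_a", 4),
    Feature("enc_block1_b", 4),  # Same stride: only the first one is used.
]
match = first_skip_source(encoder_features, stride=4)
missing = first_skip_source(encoder_features, stride=8)
```

Returning None when nothing matches lets a block fall back to plain upsampling without a skip connection.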
- make_encoder(x_in: tensorflow.python.framework.ops.Tensor, current_stride: int, prefix: str = 'enc') Tuple[tensorflow.python.framework.ops.Tensor, List[sleap.nn.architectures.common.IntermediateFeature]] [source]#
Instantiate the encoder layers defined by the encoder stack configuration.
- Parameters
x_in – The input tensor.
current_stride – The stride of x_in relative to the original input. If any pooling was performed before the encoder, this must be specified in order to appropriately set the stride in the returned intermediate features.
prefix – String prefix for naming encoder layers.
- Returns
A tuple of the final output tensor of the encoder and a list of `IntermediateFeature`s.
The intermediate features contain the output tensors from every block except the last. These can be reused in the decoder to form skip connections.
- make_stem(x_in: tensorflow.python.framework.ops.Tensor, prefix: str = 'stem') tensorflow.python.framework.ops.Tensor [source]#
Instantiate the stem layers defined by the stem block configuration.
Unlike in the encoder, the stem layers do not get repeated in stacked models.
- Parameters
x_in – The input tensor.
current_stride – The stride of x_in relative to the original input. If any pooling was performed before the stem, this must be specified in order to appropriately set the stride in the rest of the model.
prefix – String prefix for naming stem layers.
- Returns
The final output tensor of the stem.
- property maximum_stride: int#
Return the maximum stride that the input must be divisible by.
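One common way to satisfy a divisibility requirement like this is to pad each spatial dimension up to the next multiple of the maximum stride. The helper below is a hypothetical sketch of that arithmetic only; it does not describe how SLEAP itself handles input sizes.

```python
def pad_to_stride(size: int, max_stride: int) -> int:
    """Round size up to the nearest multiple of max_stride.

    Hypothetical helper for illustration; operates on a single dimension.
    """
    if size < 1 or max_stride < 1:
        raise ValueError("size and max_stride must be positive")
    remainder = size % max_stride
    if remainder == 0:
        return size
    return size + (max_stride - remainder)


# An image that is 1000 pixels wide needs 8 pixels of padding for stride 16.
padded_width = pad_to_stride(1000, 16)
padding = padded_width - 1000
```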
- property output_stride: int#
Return stride of the output of the backbone.
- property stem_features_stride: int#
Return the relative stride of the final output of the stem block.
This is equivalent to the stride of the stem assuming that it is constructed from an input with stride 1.
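The relative strides of the stem, encoder and decoder combine multiplicatively, which the following sketch traces end to end. The function name and its list-of-strides arguments are hypothetical; it only shows the arithmetic under the assumption that pooling multiplies the stride and upsampling divides it.

```python
from math import prod
from typing import Sequence


def backbone_output_stride(
    stem_pooling: Sequence[int],
    encoder_pooling: Sequence[int],
    decoder_upsampling: Sequence[int],
) -> int:
    """Trace the stride from the original input to the decoder output.

    Hypothetical helper for illustration only.
    """
    # Pooling in the stem and encoder increases the stride.
    downsampled = prod(stem_pooling) * prod(encoder_pooling)
    # Upsampling in the decoder decreases it again.
    upsampled = prod(decoder_upsampling)
    if downsampled % upsampled != 0:
        raise ValueError("decoder upsamples past the original resolution")
    return downsampled // upsampled


# No stem, four pooling steps, three upsampling steps: 16 / 8.
no_stem = backbone_output_stride([], [2, 2, 2, 2], [2, 2, 2])
# One stem pooling step, three encoder pooling steps, four upsampling steps.
with_stem = backbone_output_stride([2], [2, 2, 2], [2, 2, 2, 2])
```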
- property stem_stack: Optional[Sequence[sleap.nn.architectures.encoder_decoder.EncoderBlock]]#
Return a list of encoder blocks that define the stem.
- class sleap.nn.architectures.encoder_decoder.SimpleConvBlock(pool: bool = True, pooling_stride: int = 2, pool_before_convs: bool = False, num_convs: int = 2, filters: int = 32, kernel_size: int = 3, use_bias: bool = True, batch_norm: bool = False, batch_norm_before_activation: bool = True, activation: str = 'relu', block_prefix: str = '')[source]#
Flexible block of convolutions and max pooling.
- pool#
If True, applies max pooling at the end of the block.
- Type
bool
- pooling_stride#
Stride of the max pooling operation. If 1, the output of this block will be at the same stride (== 1/scale) as the input.
- Type
int
- pool_before_convs#
If True, max pooling is performed before convolutions.
- Type
bool
- num_convs#
Number of convolution layers with activation. All attributes below are the same for all convolution layers within the block.
- Type
int
- filters#
Number of convolutional kernel filters.
- Type
int
- kernel_size#
Size of convolutional kernels (== height == width).
- Type
int
- use_bias#
If False, convolution layers will not have a bias term.
- Type
bool
- batch_norm#
If True, applies batch normalization after each convolution.
- Type
bool
- batch_norm_before_activation#
If True, batch normalization is applied to the features computed from the linear convolution operation before the activation function, i.e.:
conv -> BN -> activation function
If False, the mini-block will look like:
conv -> activation function -> BN
- Type
bool
- activation#
Name of activation function (typically “relu” or “linear”).
- Type
str
- block_prefix#
String to append to the prefix provided at block creation time.
- Type
str
Note
This block is used in LeapCNN and UNet.
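To make the interaction between these attributes concrete, the sketch below lists the order of operations implied by the descriptions above. It is a hypothetical helper that returns operation names rather than building layers, and it reflects a reading of the attribute documentation rather than the block's actual implementation.

```python
from typing import List


def conv_block_ops(
    pool: bool = True,
    pool_before_convs: bool = False,
    num_convs: int = 2,
    batch_norm: bool = False,
    batch_norm_before_activation: bool = True,
) -> List[str]:
    """List the operations of a conv block in order (illustration only)."""
    ops: List[str] = []
    if pool and pool_before_convs:
        ops.append("pool")
    for _ in range(num_convs):
        ops.append("conv")
        if batch_norm and batch_norm_before_activation:
            ops.append("bn")  # conv -> BN -> activation
        ops.append("act")
        if batch_norm and not batch_norm_before_activation:
            ops.append("bn")  # conv -> activation -> BN
    if pool and not pool_before_convs:
        ops.append("pool")
    return ops


default_ops = conv_block_ops()
bn_last_ops = conv_block_ops(
    pool_before_convs=True,
    num_convs=1,
    batch_norm=True,
    batch_norm_before_activation=False,
)
```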
- make_block(x_in: tensorflow.python.framework.ops.Tensor, prefix: str = 'conv_block') tensorflow.python.framework.ops.Tensor [source]#
Create the block from an input tensor.
- Parameters
x_in – Input tensor to the block.
prefix – String that will be added to the name of every layer in the block. If not specified, instantiating this block multiple times may result in name conflicts if existing layers have the same name.
- Returns
The output tensor after applying all operations in the block.
- class sleap.nn.architectures.encoder_decoder.SimpleUpsamplingBlock(upsampling_stride: int = 2, transposed_conv: bool = False, transposed_conv_filters: int = 64, transposed_conv_kernel_size: int = 3, transposed_conv_use_bias: bool = True, transposed_conv_batch_norm: bool = True, transposed_conv_batch_norm_before_activation: bool = True, transposed_conv_activation: str = 'relu', interp_method: str = 'bilinear', skip_connection: bool = False, skip_add: bool = False, refine_convs: int = 2, refine_convs_first_filters: Optional[int] = None, refine_convs_filters: int = 64, refine_convs_use_bias: bool = True, refine_convs_kernel_size: int = 3, refine_convs_batch_norm: bool = True, refine_convs_batch_norm_before_activation: bool = True, refine_convs_activation: str = 'relu')[source]#
Standard block of upsampling with optional refinement and skip connections.
- upsampling_stride#
The striding of the upsampling layer. This is typically set to 2, such that the input tensor doubles in size after the block, but can be set higher to upsample in fewer steps.
- Type
int
- transposed_conv#
If True, use a strided transposed convolution to perform learnable upsampling. If False, interpolated upsampling will be used (see interp_method) and transposed_conv_* attributes will have no effect.
- Type
bool
- transposed_conv_filters#
Integer that specifies the number of filters in the transposed convolution layer.
- Type
int
- transposed_conv_kernel_size#
Size of the kernel for the transposed convolution.
- Type
int
- transposed_conv_use_bias#
If False, transposed convolution layers will not have a bias term.
- Type
bool
- transposed_conv_batch_norm#
If True, applies batch normalization after the transposed convolution.
- Type
bool
- transposed_conv_batch_norm_before_activation#
If True, batch normalization is applied to the features computed from the linear transposed convolution operation before the activation function, i.e.:
transposed conv -> BN -> activation function
If False, the mini-block will look like:
transposed conv -> activation function -> BN
- Type
bool
- transposed_conv_activation#
Name of activation function (typically “relu” or “linear”).
- Type
str
- interp_method#
String specifying the type of interpolation to use if transposed_conv is set to False. This can be bilinear or nearest. See tf.keras.layers.UpSampling2D for more details on the implementation.
- Type
str
- skip_connection#
If True, the block will form a skip connection with source features if provided during instantiation in the make_block method. If False, no skip connection will be formed even if a source feature is available.
- Type
bool
- skip_add#
If True, the skip connection will be formed by adding the source feature to the output of the upsampling operation. If they have a different number of channels, a 1x1 linear convolution will be applied to the source first (similar to residual shortcut connections). If False, the two tensors will be concatenated channelwise instead.
- Type
bool
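The effect of skip_add on channel counts can be worked out without building any layers. The two helpers below are hypothetical and only encode the rule stated above: addition keeps the channel count of the upsampled tensor (projecting the source when needed), while concatenation sums the two counts.

```python
def fused_channels(
    upsampled_channels: int, skip_channels: int, skip_add: bool
) -> int:
    """Channel count after fusing the skip source (illustration only)."""
    if skip_add:
        # Element-wise addition preserves the upsampled tensor's channels.
        return upsampled_channels
    # Channelwise concatenation stacks both tensors.
    return upsampled_channels + skip_channels


def needs_projection(
    upsampled_channels: int, skip_channels: int, skip_add: bool
) -> bool:
    """Whether the source needs a 1x1 convolution before being added."""
    return skip_add and upsampled_channels != skip_channels


added = fused_channels(64, 32, skip_add=True)
concatenated = fused_channels(64, 32, skip_add=False)
```

Addition keeps the input to the refinement convolutions narrower, at the cost of an extra 1x1 convolution whenever the channel counts differ.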
- refine_convs#
If greater than 0, specifies the number of convolutions that will be applied after the upsampling step. These layers can serve to “mix” the features fused by the skip connection, or to refine the current feature map after upsampling, which can help to prevent aliasing and checkerboard effects. If 0, no additional convolutions will be applied after upsampling and the skip connection (if present), and all refine_convs_* attributes will have no effect. If greater than 1, all layers will be identical with respect to these attributes.
- Type
int
- refine_convs_first_filters#
If not None, the first refinement conv layer will have this many filters; otherwise, it will have refine_convs_filters.
- Type
Optional[int]
- refine_convs_filters#
Specifies the number of filters to use for the refinement convolutions.
- Type
int
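How refine_convs, refine_convs_first_filters and refine_convs_filters combine can be shown as a per-layer filter schedule. The function below is a hypothetical sketch based on the attribute descriptions above, not the block's actual code.

```python
from typing import List, Optional


def refine_filter_schedule(
    refine_convs: int = 2,
    refine_convs_filters: int = 64,
    refine_convs_first_filters: Optional[int] = None,
) -> List[int]:
    """Number of filters for each refinement convolution (illustration only)."""
    schedule = [refine_convs_filters] * refine_convs
    # Only the first layer is overridden, and only if there is a first layer.
    if schedule and refine_convs_first_filters is not None:
        schedule[0] = refine_convs_first_filters
    return schedule


default_schedule = refine_filter_schedule()
wide_first = refine_filter_schedule(
    refine_convs=3, refine_convs_filters=32, refine_convs_first_filters=128
)
```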
- refine_convs_kernel_size#
Size of the kernel for the refinement convolution.
- Type
int
- refine_convs_use_bias#
If False, refinement convolution layers will not have a bias term.
- Type
bool
- refine_convs_batch_norm#
If True, applies batch normalization after each refinement convolution.
- Type
bool
- refine_convs_batch_norm_before_activation#
If True, batch normalization is applied to the features computed from each linear refinement convolution operation before the activation function, i.e.:
conv -> BN -> activation function
If False, the mini-block will look like:
conv -> activation function -> BN
- Type
bool
- refine_convs_activation#
Name of activation function (typically “relu” or “linear”).
- Type
str
Note
This block is used in LeapCNN and UNet.
- make_block(x: tensorflow.python.framework.ops.Tensor, current_stride: Optional[int] = None, skip_source: Optional[tensorflow.python.framework.ops.Tensor] = None, prefix: str = 'upsample') tensorflow.python.framework.ops.Tensor [source]#
Instantiate the decoder block from an input tensor.
- Parameters
x – Input tensor to the block.
current_stride – The stride of the input tensor. Not required, but if provided, it will be used to prepend the stride to the prefix.
skip_source – A tensor that will be used to form a skip connection if the block is configured to use it.
prefix – String that will be added to the name of every layer in the block. If not specified, instantiating this block multiple times may result in name conflicts if existing layers have the same name.
- Returns
The output tensor after applying all operations in the block.