sleap.nn.architectures.hourglass#
This module provides a generalized implementation of the (stacked) hourglass architecture. See the Hourglass class docstring for more information.
- class sleap.nn.architectures.hourglass.DownsamplingBlock(pool: bool = True, pooling_stride: int = 2, filters: int = 256)[source]#
Convolutional downsampling block of the hourglass.
This block is the simplified convolution-only block described in the Associative Embedding paper, not the residual block used in the original hourglass paper. It is simpler and demonstrated performance similar to that of the residual block.
- The structure of this block is simply:
MaxPool(stride 2) -> Conv(3 x 3 x filters)
- filters#
Number of filters in the convolutional layer of the block.
- Type:
int
- make_block(x_in: Tensor, prefix: str = 'downsample') Tensor [source]#
Create the block from an input tensor.
- Parameters:
x_in – Input tensor to the block.
prefix – String that will be added to the name of every layer in the block. If not specified, instantiating this block multiple times may result in name conflicts if existing layers have the same name.
- Returns:
The output tensor after applying all operations in the block.
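As a rough illustration of the structure above, the following sketch traces the tensor shape through one downsampling block. It is plain Python shape arithmetic rather than SLEAP code: the helper name `downsampling_block_shape` is hypothetical, and it assumes "same" padding for both the pooling and the convolution, which this page does not state.

```python
import math


def downsampling_block_shape(height, width, channels, filters=256, pooling_stride=2):
    """Trace a (height, width, channels) shape through MaxPool -> Conv(3 x 3 x filters).

    Assumes "same" padding everywhere (an assumption, not confirmed by the docs).
    Returns the shape after pooling and the shape after the convolution.
    """
    # Max pooling shrinks the spatial size and keeps the channel count.
    pooled = (
        math.ceil(height / pooling_stride),
        math.ceil(width / pooling_stride),
        channels,
    )
    # A stride-1 convolution keeps the spatial size and sets the channel count.
    conv_out = (pooled[0], pooled[1], filters)
    return pooled, conv_out


pooled, conv_out = downsampling_block_shape(96, 96, 256, filters=384)
print(pooled)    # (48, 48, 256)
print(conv_out)  # (48, 48, 384)
```

Under these assumptions the block halves the spatial size, and only the convolution changes the channel count.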
- class sleap.nn.architectures.hourglass.Hourglass(down_blocks: int = 4, up_blocks: int = 4, stem_filters: int = 128, stem_stride: int = 4, filters: int = 256, filter_increase: int = 128, interp_method: str = 'nearest', stacks: int = 3)[source]#
Encoder-decoder definition of the (stacked) hourglass network backbone.
This implements the architecture of the Associative Embedding paper, which improves upon the architecture in the original hourglass paper. The primary changes are to replace the residual block with simple convolutions and modify the filter sizes.
- The basic structure of this backbone is:
x_in -> stem -> {encoder_stack -> decoder_stack} * stacks -> x_out
- down_blocks#
Number of downsampling blocks. The original implementation has 4 downsampling blocks.
- Type:
int
- up_blocks#
Number of upsampling blocks. The original implementation is symmetric and has 4 upsampling blocks. If specifying more than 1 stack, this should be equal to down_blocks.
- Type:
int
- stem_filters#
Number of filters to output from the stem block. This block of convolutions will not be repeated across stacks, so it serves as a convenient way to reduce the input image size while extracting fine-scale image features. In the original implementation this is 128.
- Type:
int
- stem_stride#
Stride of the stem block. This can be set to 1, 2 or 4. If >1, this increases the spatial receptive field at the cost of losing fine details at higher resolution. In the original implementation this is 4.
- Type:
int
- filters#
Base number of filters. This is the number of filters in the first block; subsequent blocks have a linearly increasing number of filters (see filter_increase). In the original implementation this is 256.
- Type:
int
- filter_increase#
Number to increment the number of filters in each subsequent block by. This number is added, not multiplied, at each block. In the original implementation this is 128.
- Type:
int
- interp_method#
Method for interpolation in the upsampling blocks. In the original implementation this is nearest neighbor interpolation. Valid values are “nearest” or “bilinear”.
- Type:
str
- stacks#
Number of repeated symmetric downsampling -> upsampling stacks. The intermediate output of each stack is returned, so it can be used to apply intermediate supervision.
- Type:
int
- property decoder_stack: List[DecoderBlock]#
Define decoder stack configuration.
- property encoder_stack: List[EncoderBlock]#
Define encoder stack configuration.
- classmethod from_config(config: HourglassConfig) Hourglass [source]#
Create a model from a set of configuration parameters.
- Parameters:
config – An HourglassConfig instance with the desired parameters.
- Returns:
An instance of this class with the specified configuration.
- property stem_stack: List[EncoderBlock]#
Define stem stack configuration.
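The filter and stride arithmetic described in the attributes of this class can be sketched in plain Python. The helpers `filter_schedule` and `stride_trace` are hypothetical names, not part of SLEAP. The sketch assumes that every downsampling and upsampling block changes the stride by a factor of 2 (the default in DownsamplingBlock and UpsamplingBlock) and that block `i` (counting from 0) uses `filters + i * filter_increase` filters, which is one reading of the "added, not multiplied" rule.

```python
def filter_schedule(filters=256, filter_increase=128, blocks=4):
    """Filters per block when a fixed increment is added at each block."""
    return [filters + i * filter_increase for i in range(blocks)]


def stride_trace(stem_stride=4, down_blocks=4, up_blocks=4):
    """List (stage, stride) pairs for the stem and one encoder/decoder stack.

    Assumes each down block doubles the stride and each up block halves it.
    """
    stride = stem_stride
    trace = [("stem", stride)]
    for i in range(down_blocks):
        stride *= 2
        trace.append((f"down{i}", stride))
    for i in range(up_blocks):
        if stride < 2:
            raise ValueError("more upsampling than the current stride allows")
        stride //= 2
        trace.append((f"up{i}", stride))
    return trace


print(filter_schedule())   # [256, 384, 512, 640]
print(stride_trace()[-1])  # ('up3', 4)
```

With the defaults, the stride peaks at 64 after the last downsampling block and returns to the stem stride of 4 at the end of the stack. This is a plausible reason why up_blocks should equal down_blocks when more than one stack is used: each stack then hands the next one a tensor at the same stride it received.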
- class sleap.nn.architectures.hourglass.StemBlock(pool: bool = True, pooling_stride: int = 4, filters: int = 128, output_filters: int = 256)[source]#
Stem layers of the hourglass. These are not repeated with multiple stacks.
- The default structure of this block is:
Conv(7 x 7 x filters, stride 2) -> Conv(3 x 3 x 2*filters) -> MaxPool(stride 2) -> Conv(3 x 3 x output_filters)
- pool#
If True, pooling is applied. See pooling_stride.
- Type:
bool
- pooling_stride#
Determines how much pooling is applied within the stem block. If set to 1, no pooling is applied. If set to 2, the max pooling layer will have a stride of 2. If set to 4, the first convolution and the max pooling layer will both have a stride of 2.
- Type:
int
- filters#
Base number of convolutional filters.
- Type:
int
- output_filters#
Number of filters to output at the end of this block.
- Type:
int
- first_conv_stride#
Stride of the first convolutional layer. Set to 1 to increase the effective spatial resolution of the initial activations.
- make_block(x_in: Tensor, prefix: str = 'stem') Tensor [source]#
Create the block from an input tensor.
- Parameters:
x_in – Input tensor to the block.
prefix – String that will be added to the name of every layer in the block. If not specified, instantiating this block multiple times may result in name conflicts if existing layers have the same name.
- Returns:
The output tensor after applying all operations in the block.
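The effect of pooling_stride on the stem layout can be summarized with a small sketch. The helper `stem_layers` is a hypothetical name that only builds layer descriptions as strings; it is not how SLEAP constructs the block. It assumes the first convolution uses stride 1 whenever pooling_stride is 1 or 2, which the attribute description implies but does not state outright, and it does not model the pool flag.

```python
def stem_layers(pooling_stride=4, filters=128, output_filters=256):
    """Describe the stem layers implied by a given pooling_stride (1, 2 or 4)."""
    if pooling_stride not in (1, 2, 4):
        raise ValueError("pooling_stride must be 1, 2 or 4")
    # Only a total stride of 4 moves part of the downsampling into the first conv.
    first_conv_stride = 2 if pooling_stride == 4 else 1
    layers = [
        f"Conv(7 x 7 x {filters}, stride {first_conv_stride})",
        f"Conv(3 x 3 x {2 * filters})",
    ]
    # A stride of 1 means no pooling layer at all.
    if pooling_stride > 1:
        layers.append("MaxPool(stride 2)")
    layers.append(f"Conv(3 x 3 x {output_filters})")
    return layers


print(" -> ".join(stem_layers()))
# Conv(7 x 7 x 128, stride 2) -> Conv(3 x 3 x 256) -> MaxPool(stride 2) -> Conv(3 x 3 x 256)
```

With the defaults, the sketch reproduces the default structure listed for this block.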
- class sleap.nn.architectures.hourglass.UpsamplingBlock(upsampling_stride: int = 2, filters: int = 256, interp_method: str = 'bilinear')[source]#
Upsampling block that integrates skip connections with refinement.
This block implements both the intermediate block after the skip connection from the downsampling path, as well as the upsampling block from the main network backbone path.
- The structure of this block is:
x_in -> Conv(3 x 3 x filters) -> Upsample -> x_up
skip_in -> Conv(3 x 3 x filters) -> x_middle
x_up + x_middle -> x_out
- filters#
Number of filters in the output tensor.
- Type:
int
- interp_method#
Interpolation method for the upsampling step. In the original implementation, nearest neighbor interpolation was used. Valid values are “nearest” or “bilinear”.
- Type:
str
- make_block(x: Tensor, current_stride: int | None = None, skip_source: IntermediateFeature | None = None, prefix: str = 'upsample') Tensor [source]#
Instantiate the upsampling block from an input tensor.
- Parameters:
x – Input tensor to the block.
current_stride – The stride of the input tensor.
skip_source – A tensor that will be used to form a skip connection if the block is configured to use it.
prefix – String that will be added to the name of every layer in the block. If not specified, instantiating this block multiple times may result in name conflicts if existing layers have the same name.
- Returns:
The output tensor after applying all operations in the block.
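The merge step of this block (upsample the main path, then add the skip path) can be illustrated on small 2D grids. This is a minimal sketch in plain Python, not SLEAP code: the helpers `upsample_nearest` and `add_skip` are hypothetical, the two convolutions are left out entirely, and only nearest neighbor interpolation with a stride of 2 is shown.

```python
def upsample_nearest(grid, stride=2):
    """Nearest neighbor upsampling: repeat each value `stride` times per axis."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(stride)]
        out.extend(list(wide) for _ in range(stride))
    return out


def add_skip(x_up, x_middle):
    """Elementwise sum of the upsampled path and the skip path."""
    same_shape = len(x_up) == len(x_middle) and all(
        len(a) == len(b) for a, b in zip(x_up, x_middle)
    )
    if not same_shape:
        raise ValueError("skip tensor must match the upsampled spatial size")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(x_up, x_middle)]


x_in = [[1, 2], [3, 4]]                    # main path at the coarser stride
skip_in = [[10] * 4 for _ in range(4)]     # skip path at the finer stride
x_out = add_skip(upsample_nearest(x_in), skip_in)
print(x_out[0])  # [11, 11, 12, 12]
```

The shape check reflects the requirement that the skip source comes from the point in the downsampling path whose spatial size matches the upsampled tensor.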
- sleap.nn.architectures.hourglass.conv(x: Tensor, filters: int, kernel_size: int = 3, stride: int = 1, prefix: str = 'conv') Tensor [source]#
Apply basic convolution with ReLU and batch normalization.
- Parameters:
x – Input tensor.
filters – Number of convolutional filters (output channels).
kernel_size – Size (height == width) of convolutional kernel.
stride – Striding of convolution. If >1, the output is smaller than the input.
prefix – String to prepend to the sublayers of this convolution.
- Returns:
The output tensor after applying the convolution with ReLU and batch normalization.
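How much a stride greater than 1 shrinks the output depends on the padding mode, which this page does not specify for conv. The sketch below uses the standard output size formulas for "same" and "valid" padding. The helper `conv_output_size` is a hypothetical name, and which mode SLEAP actually uses here is left as an open assumption.

```python
import math


def conv_output_size(size, kernel_size=3, stride=1, padding="same"):
    """Spatial output size of a convolution along one axis."""
    if padding == "same":
        # Input is padded so that only the stride reduces the size.
        return math.ceil(size / stride)
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (size - kernel_size) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")


print(conv_output_size(96, kernel_size=3, stride=1))                   # 96
print(conv_output_size(96, kernel_size=7, stride=2))                   # 48
print(conv_output_size(96, kernel_size=7, stride=2, padding="valid"))  # 45
```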