sleap.nn.architectures.hourglass

This module provides a generalized implementation of the (stacked) hourglass architecture.

See the Hourglass class docstring for more information.

class sleap.nn.architectures.hourglass.DownsamplingBlock(pool: bool = True, pooling_stride: int = 2, filters: int = 256)

Convolutional downsampling block of the hourglass.

This block is the simplified convolution-only block described in the Associative Embedding paper, not the residual block used in the original hourglass paper. It is simpler and demonstrated similar performance to the residual block.

The structure of this block is simply:

MaxPool(stride 2) -> Conv(3 x 3 x filters)
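To make the pooling step concrete, the sketch below applies a max pool with stride 2 to a small single-channel grid in plain Python. It is not the library's code: the helper name `max_pool_2d` and the 2 x 2 window size are assumptions for illustration, and only the stride is taken from the structure above.

```python
from typing import List


def max_pool_2d(grid: List[List[float]], stride: int = 2, window: int = 2) -> List[List[float]]:
    """Max-pool a 2D grid (hypothetical helper; the window size is an assumption)."""
    height, width = len(grid), len(grid[0])
    pooled = []
    for top in range(0, height - window + 1, stride):
        row = []
        for left in range(0, width - window + 1, stride):
            # Keep the largest value in the window anchored at (top, left).
            row.append(max(
                grid[r][c]
                for r in range(top, top + window)
                for c in range(left, left + window)
            ))
        pooled.append(row)
    return pooled


grid = [
    [1, 2, 0, 1],
    [3, 4, 1, 0],
    [0, 1, 5, 6],
    [2, 0, 7, 8],
]
pooled = max_pool_2d(grid)  # a 4 x 4 input becomes a 2 x 2 output
```

With stride 2, each spatial dimension is halved before the 3 x 3 convolution is applied.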

filters

Number of filters in the convolutional layer of the block.

Type:

int

make_block(x_in: Tensor, prefix: str = 'downsample') -> Tensor

Create the block from an input tensor.

Parameters:
  • x_in – Input tensor to the block.

  • prefix – String that will be added to the name of every layer in the block. If not specified, instantiating this block multiple times may result in name conflicts if existing layers have the same name.

Returns:

The output tensor after applying all operations in the block.

class sleap.nn.architectures.hourglass.Hourglass(down_blocks: int = 4, up_blocks: int = 4, stem_filters: int = 128, stem_stride: int = 4, filters: int = 256, filter_increase: int = 128, interp_method: str = 'nearest', stacks: int = 3)

Encoder-decoder definition of the (stacked) hourglass network backbone.

This implements the architecture of the Associative Embedding paper, which improves upon the architecture in the original hourglass paper. The primary changes are to replace the residual block with simple convolutions and modify the filter sizes.

The basic structure of this backbone is:

x_in -> stem -> {encoder_stack -> decoder_stack} * stacks -> x_out
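The structure above can be followed by tracking the cumulative stride (input resolution divided by feature map resolution) through each stage. The sketch below is an illustration, not the library's code: `trace_strides` is a hypothetical helper, and it assumes every downsampling and upsampling block changes the resolution by a factor of 2, matching the default `pooling_stride` and `upsampling_stride` of the block classes.

```python
from typing import List, Tuple


def trace_strides(stem_stride: int = 4, down_blocks: int = 4, up_blocks: int = 4,
                  stacks: int = 3, block_stride: int = 2) -> List[Tuple[str, int]]:
    """Return (stage name, cumulative stride) pairs (hypothetical helper)."""
    stride = stem_stride
    stages = [("stem", stride)]  # the stem runs once, before any stack
    for stack in range(stacks):
        for i in range(down_blocks):
            stride *= block_stride  # a downsampling block increases the stride
            stages.append((f"stack{stack}_down{i}", stride))
        for i in range(up_blocks):
            stride //= block_stride  # an upsampling block decreases the stride
            stages.append((f"stack{stack}_up{i}", stride))
    return stages


stages = trace_strides()
bottleneck_stride = max(stride for _, stride in stages)
output_stride = stages[-1][1]
```

Under these assumptions the default configuration reaches its coarsest resolution at a stride of 64 and returns to the stem stride of 4 at the end of every stack, which is why symmetric stacks can be chained.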

down_blocks

Number of downsampling blocks. The original implementation has 4 downsampling blocks.

Type:

int

up_blocks

Number of upsampling blocks. The original implementation is symmetric and has 4 upsampling blocks. If specifying more than 1 stack, this should be equal to down_blocks.

Type:

int

stem_filters

Number of filters to output from the stem block. This block of convolutions will not be repeated across stacks, so it serves as a convenient way to reduce the input image size while extracting fine-scale image features. In the original implementation this is 128.

Type:

int

stem_stride

Stride of the stem block. This can be set to 1, 2 or 4. If >1, this increases the spatial receptive field at the cost of losing fine details at higher resolution. In the original implementation this is 4.

Type:

int

filters

Base number of filters. This is the number of filters in the first block; each subsequent block has a linearly increasing number of filters (see filter_increase). In the original implementation this is 256.

Type:

int

filter_increase

Number to increment the number of filters in each subsequent block by. This number is added, not multiplied, at each block. In the original implementation this is 128.

Type:

int
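The additive growth described for filters and filter_increase can be sketched as a short calculation. `block_filters` is a hypothetical helper, and the sketch assumes the increment is applied once per block starting from zero at the first block; how the library assigns filters to the upsampling blocks is not covered here.

```python
from typing import List


def block_filters(filters: int = 256, filter_increase: int = 128, n_blocks: int = 4) -> List[int]:
    """Filters per block under additive growth (hypothetical helper)."""
    # The increase is added, not multiplied, at each subsequent block.
    return [filters + i * filter_increase for i in range(n_blocks)]


encoder_filters = block_filters()  # [256, 384, 512, 640]
```

With multiplicative growth the fourth block would instead have 2048 filters, so the additive scheme keeps deeper blocks considerably smaller.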

interp_method

Method for interpolation in the upsampling blocks. In the original implementation this is nearest neighbor interpolation. Valid values are “nearest” or “bilinear”.

Type:

str

stacks

Number of repeated symmetric downsampling -> upsampling stacks. The intermediate output of each stack is returned and can be used to apply intermediate supervision.

Type:

int
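Intermediate supervision generally means computing the loss on the output of every stack and summing the results, rather than supervising only the final output. The sketch below illustrates that idea on plain lists of numbers. The function names and the choice of mean squared error are assumptions for illustration, not the training code used by the library.

```python
from typing import List, Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equally sized sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def intermediate_supervision_loss(stack_outputs: List[Sequence[float]],
                                  target: Sequence[float]) -> float:
    """Sum the per-stack losses against one shared target (hypothetical helper)."""
    return sum(mse(output, target) for output in stack_outputs)


target = [0.0, 1.0, 0.0]
stack_outputs = [
    [0.5, 0.5, 0.5],     # first stack: coarse prediction
    [0.25, 0.75, 0.25],  # second stack: refined prediction
    [0.0, 1.0, 0.0],     # third stack: matches the target
]
total = intermediate_supervision_loss(stack_outputs, target)
```

Each stack contributes its own term, so earlier stacks receive a direct training signal instead of relying only on gradients from the last one.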

property decoder_stack: List[DecoderBlock]

Define decoder stack configuration.

property encoder_stack: List[EncoderBlock]

Define encoder stack configuration.

classmethod from_config(config: HourglassConfig) -> Hourglass

Create a model from a set of configuration parameters.

Parameters:

config – An HourglassConfig instance with the desired parameters.

Returns:

An instance of this class with the specified configuration.

property stem_stack: List[EncoderBlock]

Define stem stack configuration.

class sleap.nn.architectures.hourglass.StemBlock(pool: bool = True, pooling_stride: int = 4, filters: int = 128, output_filters: int = 256)

Stem layers of the hourglass. These are not repeated with multiple stacks.

The default structure of this block is:

Conv(7 x 7 x filters, stride 2) -> Conv(3 x 3 x 2*filters) -> MaxPool(stride 2) -> Conv(3 x 3 x output_filters)

pool

If True, pooling is applied. See pooling_stride.

Type:

bool

pooling_stride

Determines how much pooling is applied within the stem block. If set to 1, no pooling is applied. If set to 2, the max pooling layer will have a stride of 2. If set to 4, the first convolution and the max pooling layer will both have a stride of 2.

Type:

int
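The three pooling_stride settings can be read as a mapping from the total stem stride to the strides of the first convolution and the max pooling layer. The sketch below spells out that mapping. `stem_strides` is a hypothetical helper, and a first convolution stride of 1 for the settings 1 and 2 is inferred from the description rather than stated in it.

```python
from typing import Tuple


def stem_strides(pooling_stride: int) -> Tuple[int, int]:
    """Map pooling_stride to (first conv stride, max pool stride).

    Hypothetical helper; a max pool stride of 1 stands for "no pooling".
    """
    if pooling_stride == 1:
        return 1, 1  # no pooling is applied
    if pooling_stride == 2:
        return 1, 2  # only the max pooling layer is strided
    if pooling_stride == 4:
        return 2, 2  # first convolution and max pooling are both strided
    raise ValueError("pooling_stride must be 1, 2 or 4")


strides_by_setting = {setting: stem_strides(setting) for setting in (1, 2, 4)}
```

In every case the product of the two strides equals the requested total stride.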

filters

Base number of convolutional filters.

Type:

int

output_filters

Number of filters to output at the end of this block.

Type:

int

first_conv_stride

Stride of the first convolutional layer. Set to 1 to increase the effective spatial resolution of the initial activations.

make_block(x_in: Tensor, prefix: str = 'stem') -> Tensor

Create the block from an input tensor.

Parameters:
  • x_in – Input tensor to the block.

  • prefix – String that will be added to the name of every layer in the block. If not specified, instantiating this block multiple times may result in name conflicts if existing layers have the same name.

Returns:

The output tensor after applying all operations in the block.

class sleap.nn.architectures.hourglass.UpsamplingBlock(upsampling_stride: int = 2, filters: int = 256, interp_method: str = 'bilinear')

Upsampling block that integrates skip connections with refinement.

This block implements both the intermediate block after the skip connection from the downsampling path, as well as the upsampling block from the main network backbone path.

The structure of this block is:

x_in -> Conv(3 x 3 x filters) -> Upsample -> x_up
skip_in -> Conv(3 x 3 x filters) -> x_middle
x_up + x_middle -> x_out
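The data flow above requires the upsampled tensor and the refined skip tensor to have identical shapes, since they are added elementwise. The sketch below checks this with shape arithmetic only. `upsampling_block_shape` is a hypothetical helper, and it assumes both convolutions preserve the spatial size (stride 1 with "same" padding).

```python
from typing import Tuple

Shape = Tuple[int, int, int]  # (height, width, channels)


def upsampling_block_shape(x_in: Shape, skip_in: Shape, filters: int = 256,
                           upsampling_stride: int = 2) -> Shape:
    """Return the output shape of the block, or raise if the paths cannot be added."""
    # x_in -> Conv(3 x 3 x filters) -> Upsample -> x_up
    x_up = (x_in[0] * upsampling_stride, x_in[1] * upsampling_stride, filters)
    # skip_in -> Conv(3 x 3 x filters) -> x_middle
    x_middle = (skip_in[0], skip_in[1], filters)
    # x_up + x_middle -> x_out (elementwise addition needs identical shapes)
    if x_up != x_middle:
        raise ValueError(f"shape mismatch: {x_up} vs {x_middle}")
    return x_up


x_out = upsampling_block_shape(x_in=(8, 8, 640), skip_in=(16, 16, 512))
```

Because both paths pass through a convolution with the same number of filters, the channel counts of the inputs may differ; only the spatial sizes have to line up after upsampling.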

filters

Number of filters in the output tensor.

Type:

int

interp_method

Interpolation method for the upsampling step. In the original implementation, nearest neighbor interpolation was used. Valid values are “nearest” or “bilinear”.

Type:

str
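As an illustration of the "nearest" option, the sketch below performs nearest neighbor upsampling on a small single-channel grid by repeating every value along both axes. `upsample_nearest` is a hypothetical plain-Python helper written for this example, not the layer the block uses.

```python
from typing import List


def upsample_nearest(grid: List[List[float]], stride: int = 2) -> List[List[float]]:
    """Nearest neighbor upsampling: repeat each value `stride` times per axis."""
    out = []
    for row in grid:
        # Repeat every value horizontally.
        wide = [value for value in row for _ in range(stride)]
        # Repeat the widened row vertically (copied so rows are independent).
        out.extend(list(wide) for _ in range(stride))
    return out


upsampled = upsample_nearest([[1, 2], [3, 4]])
# [[1, 1, 2, 2],
#  [1, 1, 2, 2],
#  [3, 3, 4, 4],
#  [3, 3, 4, 4]]
```

Bilinear interpolation would instead blend neighboring values, producing smoother but less sharply bounded feature maps.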

make_block(x: Tensor, current_stride: int | None = None, skip_source: IntermediateFeature | None = None, prefix: str = 'upsample') -> Tensor

Instantiate the upsampling block from an input tensor.

Parameters:
  • x – Input tensor to the block.

  • current_stride – The stride of the input tensor.

  • skip_source – A tensor that will be used to form a skip connection if the block is configured to use it.

  • prefix – String that will be added to the name of every layer in the block. If not specified, instantiating this block multiple times may result in name conflicts if existing layers have the same name.

Returns:

The output tensor after applying all operations in the block.

sleap.nn.architectures.hourglass.conv(x: Tensor, filters: int, kernel_size: int = 3, stride: int = 1, prefix: str = 'conv') -> Tensor

Apply basic convolution with ReLU and batch normalization.

Parameters:
  • x – Input tensor.

  • filters – Number of convolutional filters (output channels).

  • kernel_size – Size (height == width) of convolutional kernel.

  • stride – Striding of convolution. If >1, the output is smaller than the input.

  • prefix – String to prepend to the sublayers of this convolution.

Returns:

The output tensor after applying convolution, ReLU and batch normalization.
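The effect of the stride parameter on the output size can be estimated with a one-line calculation. The sketch below assumes "same" padding, which this page does not state, and `conv_output_size` is a hypothetical helper rather than part of the module.

```python
import math


def conv_output_size(size: int, stride: int = 1) -> int:
    """Output length along one spatial axis, assuming "same" padding."""
    # With "same" padding the kernel size does not affect the output length.
    return math.ceil(size / stride)


unchanged = conv_output_size(64, stride=1)  # 64
halved = conv_output_size(64, stride=2)     # 32
odd_input = conv_output_size(65, stride=2)  # 33
```

Under this assumption, a stride of 1 keeps the spatial size and larger strides shrink it by roughly the stride factor, rounding up for sizes that are not evenly divisible.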