sleap.nn.data.resizing#

Transformers for image resizing and padding.

class sleap.nn.data.resizing.PointsRescaler(points_key: str = 'predicted_instances', scale_key: str = 'scale', invert: bool = True)[source]#

Transformer to apply or invert scaling operations on points.

property input_keys: List[str]#

Return the keys that incoming elements are expected to have.

property output_keys: List[str]#

Return the keys that outgoing elements will have.

transform_dataset(input_ds: DatasetV2) DatasetV2[source]#

Create a dataset in which the points have the scaling applied or inverted.
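The effect of the invert flag can be illustrated with a minimal pure-Python sketch. It is not the transformer's implementation: `rescale_points` is a hypothetical helper, and it assumes that inverting a scaling means dividing the point coordinates by the stored scale.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def rescale_points(points: List[Point], scale: float, invert: bool = True) -> List[Point]:
    """Apply (invert=False) or undo (invert=True) a uniform scaling on (x, y) points."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    # Assumption: undoing a resize by `scale` divides, applying it multiplies.
    factor = 1.0 / scale if invert else scale
    return [(x * factor, y * factor) for x, y in points]


# Points predicted on an image that was downscaled by a factor of 0.5 ...
predicted = [(10.0, 20.0), (32.0, 48.0)]
# ... are mapped back to original image coordinates by inverting the scaling.
restored = rescale_points(predicted, scale=0.5, invert=True)
```

With the default invert=True, this corresponds to mapping predictions made on resized images back to the coordinates of the original image.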

class sleap.nn.data.resizing.Resizer(image_key: str = 'image', scale_key: str = 'scale', points_key: str | None = 'instances', scale: float = 1.0, pad_to_stride: int = 1, keep_full_image: bool = False, full_image_key: str = 'full_image')[source]#

Data transformer to resize or pad images.

This is useful as a transformation to data streams that require resizing or padding in order to be downsampled or meet divisibility criteria.

image_key#

String name of the key containing the images to resize.

Type:

str

scale_key#

String name of the key containing the scale of the images.

Type:

str

points_key#

String name of the key containing points to adjust for the resizing operation.

Type:

str | None

scale#

Scalar float specifying scaling factor to resize images by.

Type:

float

pad_to_stride#

Maximum stride in a model that the images must be divisible by. If > 1, this will pad the bottom and right of the images to ensure they meet this divisibility criterion. Padding is applied after the scaling specified in the scale attribute.

Type:

int

keep_full_image#

If True, keeps the (original size) full image in the examples. This is useful for multi-scale inference.

Type:

bool

full_image_key#

String name of the key containing the full images.

Type:

str

classmethod from_config(config: PreprocessingConfig, image_key: str = 'image', scale_key: str = 'scale', pad_to_stride: int | None = None, keep_full_image: bool = False, full_image_key: str = 'full_image', points_key: str | None = 'instances') Resizer[source]#

Build an instance of this class from its configuration options.

Parameters:
  • config – A PreprocessingConfig instance with the desired parameters. If config.pad_to_stride is not an explicit integer, the pad_to_stride parameter must be provided.

  • image_key – String name of the key containing the images to resize.

  • scale_key – String name of the key containing the scale of the images.

  • pad_to_stride – An integer specifying the pad_to_stride if config.pad_to_stride is not an explicit integer (e.g., set to None).

  • keep_full_image – If True, keeps the (original size) full image in the examples. This is useful for multi-scale inference.

  • full_image_key – String name of the key containing the full images.

  • points_key – String name of the key containing points to adjust for the resizing operation.

Returns:

An instance of this class.

Raises:

ValueError – If config.pad_to_stride is not set to an integer and the pad_to_stride argument is not provided.
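The fallback described above can be sketched as follows. `resolve_pad_to_stride` is a hypothetical helper and not part of the library. Giving the config value precedence when both are integers is an assumption.

```python
from typing import Optional


def resolve_pad_to_stride(config_value: Optional[int], argument: Optional[int] = None) -> int:
    """Pick the stride from the config if it is an integer, else from the argument."""
    # Assumption: an explicit integer in the config takes precedence.
    if isinstance(config_value, int):
        return config_value
    if argument is not None:
        return argument
    # Neither source provides an integer, mirroring the documented ValueError.
    raise ValueError("pad_to_stride must be set in the config or passed as an argument")
```

For example, `resolve_pad_to_stride(None, 32)` falls back to the argument, while `resolve_pad_to_stride(None)` raises.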

property input_keys: List[str]#

Return the keys that incoming elements are expected to have.

property output_keys: List[str]#

Return the keys that outgoing elements will have.

transform_dataset(ds_input: DatasetV2) DatasetV2[source]#

Create a dataset that contains the resized and padded images.

Parameters:

ds_input – A dataset with the image specified in the image_key attribute, points specified in the points_key attribute, and the “scale” key for tracking scaling transformations.

Returns:

A tf.data.Dataset with elements containing the same images and points with resizing applied.

The “scale” key of the example will be multiplied by the scale attribute of this transformer.

If the keep_full_image attribute is True, a key specified by full_image_key will be added to the example containing the image before any processing.
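The bookkeeping performed on each example can be sketched without TensorFlow by tracking only the image shape. This is an illustration under stated assumptions, not the library's code: `resize_example` and the "image_shape" key are hypothetical, and truncating the scaled size to an integer is an assumption.

```python
from typing import Dict


def resize_example(example: Dict, scale: float = 1.0, pad_to_stride: int = 1) -> Dict:
    """Track how scaling and stride padding change an example's shape, points and scale."""
    out = dict(example)
    height, width = example["image_shape"]
    # 1. Scale the image (assumption: the scaled size is truncated to an integer).
    new_h, new_w = int(height * scale), int(width * scale)
    # 2. Pad the bottom and right up to the next multiple of the stride.
    padded_h = -(-new_h // pad_to_stride) * pad_to_stride
    padded_w = -(-new_w // pad_to_stride) * pad_to_stride
    out["image_shape"] = (padded_h, padded_w)
    # Points follow the scaling; bottom/right padding does not shift them.
    out["instances"] = [(x * scale, y * scale) for x, y in example["instances"]]
    # The running scale is multiplied by this transformer's scale.
    out["scale"] = example["scale"] * scale
    return out


example = {"image_shape": (600, 800), "instances": [(100.0, 200.0)], "scale": 1.0}
resized = resize_example(example, scale=0.5, pad_to_stride=32)
```

Here the 600 x 800 image is scaled to 300 x 400 and then padded to 320 x 416, the next multiples of 32.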

class sleap.nn.data.resizing.SizeMatcher(image_key: str = 'image', scale_key: str = 'scale', points_key: str | None = 'instances', keep_full_image: bool = False, full_image_key: str = 'full_image', max_image_height: int | None = None, max_image_width: int | None = None, center_pad: bool = False)[source]#

Data transformer that ensures output images have uniform shape by resizing/padding smaller images.

image_key#

String name of the key containing the images to resize.

Type:

str

scale_key#

String name of the key containing the scale of the images.

Type:

str

points_key#

String name of the key containing points to adjust for the resizing operation.

Type:

str | None

keep_full_image#

If True, keeps the (original size) full image in the examples. This is useful for multi-scale inference.

Type:

bool

full_image_key#

String name of the key containing the full images.

Type:

str

max_image_height#

The target height to which all smaller images will be resized/padded.

Type:

int

max_image_width#

The target width to which all smaller images will be resized/padded.

Type:

int

center_pad#

If True, pad on the left/top to center the non-zero pixels rather than aligning them to the top-left. The offsets will be stored in the “offset_x” and “offset_y” keys.

Type:

bool

classmethod from_config(config: PreprocessingConfig, provider: Provider | None = None, update_config: bool = True, image_key: str = 'image', scale_key: str = 'scale', keep_full_image: bool = False, full_image_key: str = 'full_image', points_key: str | None = 'instances') SizeMatcher[source]#

Build an instance of this class from configuration.

Parameters:
  • config – A PreprocessingConfig instance with the desired parameters. If config.resize_and_pad_to_target is True and ‘target_height’ / ‘target_width’ are not set, a provider that implements ‘max_height_and_width’ must be supplied.

  • provider – Data provider.

  • update_config – If True, the input model configuration will be updated with values inferred from other fields.

  • image_key – String name of the key containing the images to resize.

  • scale_key – String name of the key containing the scale of the images.

  • pad_to_stride – An integer specifying the pad_to_stride if config.pad_to_stride is not an explicit integer (e.g., set to None).

  • keep_full_image – If True, keeps the (original size) full image in the examples. This is useful for multi-scale inference.

  • full_image_key – String name of the key containing the full images.

  • points_key – String name of the key containing points to adjust for the resizing operation.

Returns:

An instance of this class.

property input_keys: List[str]#

Return the keys that incoming elements are expected to have.

property output_keys: List[str]#

Return the keys that outgoing elements will have.

transform_dataset(ds_input: DatasetV2) DatasetV2[source]#

Transform a dataset with variable size images into one with fixed sizes.

Parameters:

ds_input – A dataset with the image specified in the image_key attribute, points specified in the points_key attribute, and the “scale” key for tracking scaling transformations.

Returns:

A tf.data.Dataset with elements containing the same images and points of equal size.

If the keep_full_image attribute is True, a key specified by full_image_key will be added to the example containing the image before any processing.
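One plausible way to bring a smaller image to a fixed target size is sketched below. `fit_to_target` is a hypothetical helper. The sketch assumes the aspect ratio is preserved, the remainder is zero-padded, and center_pad splits the padding evenly. The library's exact rounding and padding rules may differ.

```python
from typing import Tuple


def fit_to_target(
    height: int,
    width: int,
    target_height: int,
    target_width: int,
    center_pad: bool = False,
) -> Tuple[float, Tuple[int, int], Tuple[int, int]]:
    """Return (scale, (new_height, new_width), (offset_x, offset_y))."""
    # Assumption: use the largest uniform scale at which the image still fits.
    scale = min(target_height / height, target_width / width)
    new_h = min(target_height, round(height * scale))
    new_w = min(target_width, round(width * scale))
    # Whatever space is left over is filled with padding.
    pad_h = target_height - new_h
    pad_w = target_width - new_w
    # Without center padding the image stays aligned to the top-left corner.
    offset_x = pad_w // 2 if center_pad else 0
    offset_y = pad_h // 2 if center_pad else 0
    return scale, (new_h, new_w), (offset_x, offset_y)


scale, new_shape, offsets = fit_to_target(240, 320, 512, 512, center_pad=True)
```

In this example a 240 x 320 image is scaled by 1.6 to 384 x 512, leaving 128 rows of padding, half of which go above the image when center_pad is True.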

sleap.nn.data.resizing.find_padding_for_stride(image_height: int, image_width: int, max_stride: int) Tuple[int, int][source]#

Compute padding required to ensure image is divisible by a stride.

This function is useful for determining how to pad images such that they will not have issues with divisibility after repeated pooling steps.

Parameters:
  • image_height – Scalar integer specifying the image height (rows).

  • image_width – Scalar integer specifying the image width (columns).

  • max_stride – Scalar integer specifying the maximum stride that the image must be divisible by.

Returns:

A tuple of (pad_bottom, pad_right), integers with the number of pixels that the image would need to be padded by to meet the divisibility requirement.
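The arithmetic behind this kind of padding can be sketched in plain Python. `padding_for_stride` is a hypothetical stand-in written for illustration, not the library function itself.

```python
from typing import Tuple


def padding_for_stride(image_height: int, image_width: int, max_stride: int) -> Tuple[int, int]:
    """Return (pad_bottom, pad_right) needed to reach the next multiple of max_stride."""
    # (-n) % s is the distance from n up to the next multiple of s (0 if already divisible).
    pad_bottom = (-image_height) % max_stride
    pad_right = (-image_width) % max_stride
    return pad_bottom, pad_right


# A 1080 x 1920 frame with a maximum stride of 32 needs 8 extra rows and no extra columns.
pad_bottom, pad_right = padding_for_stride(1080, 1920, 32)
```

The padded height, 1088, is 34 x 32, and the width of 1920 is already 60 x 32.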

sleap.nn.data.resizing.pad_to_stride(image: Tensor, max_stride: int) Tensor#

Pad an image to meet a max stride constraint.

This is useful for ensuring there is no size mismatch between an image and the output tensors after multiple downsampling and upsampling steps.

Parameters:
  • image – Single image tensor of shape (height, width, channels).

  • max_stride – Scalar integer specifying the maximum stride that the image must be divisible by. This is the ratio between the length of the image and the length of the smallest tensor it is converted to. This is typically 2 ** n_down_blocks, where n_down_blocks is the number of 2-strided reduction layers in the model.

Returns:

The input image with 0-padding applied to the bottom and/or right such that the new shape’s height and width are both divisible by max_stride.
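As a small illustration of the relation max_stride = 2 ** n_down_blocks and of bottom/right zero padding, the sketch below pads a nested-list image instead of a tensor. `pad_image_to_stride` is a hypothetical helper that assumes a non-empty image. The library function operates on tensors.

```python
from typing import List

Image = List[List[List[float]]]  # nested lists indexed as [row][column][channel]


def pad_image_to_stride(image: Image, max_stride: int) -> Image:
    """Zero-pad the bottom and right so height and width are divisible by max_stride."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    pad_bottom = (-height) % max_stride
    pad_right = (-width) % max_stride
    # Extend every existing row with zero pixels on the right.
    padded = [
        [list(px) for px in row] + [[0.0] * channels for _ in range(pad_right)]
        for row in image
    ]
    # Append all-zero rows at the bottom.
    padded += [
        [[0.0] * channels for _ in range(width + pad_right)] for _ in range(pad_bottom)
    ]
    return padded


n_down_blocks = 2
max_stride = 2 ** n_down_blocks  # two 2-strided reductions give a stride of 4
image = [[[1.0] for _ in range(5)] for _ in range(3)]  # shape (3, 5, 1)
padded = pad_image_to_stride(image, max_stride)  # shape (4, 8, 1)
```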

sleap.nn.data.resizing.resize_image(image: Tensor, scale: Tensor) Tensor[source]#

Rescale an image by a scale factor.

This function is primarily a convenience wrapper for tf.image.resize that calculates the new shape from the scale factor.

Parameters:
  • image – Single image tensor of shape (height, width, channels).

  • scale – Factor to resize the image dimensions by, specified as either a float scalar or as a 2-tuple of [scale_x, scale_y]. If a scalar is provided, both dimensions are resized by the same factor.

Returns:

The resized image tensor of the same dtype but scaled height and width.

See also: tf.image.resize
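How a new shape might be derived from the scale factor is sketched below for both the scalar and the [scale_x, scale_y] form. `scaled_shape` is a hypothetical helper, and truncating the result to an integer is an assumption about the rounding behavior.

```python
from typing import Sequence, Tuple, Union


def scaled_shape(
    height: int, width: int, scale: Union[float, Sequence[float]]
) -> Tuple[int, int]:
    """Compute (new_height, new_width) from a scalar or [scale_x, scale_y] factor."""
    if isinstance(scale, (int, float)):
        # A scalar resizes both dimensions by the same factor.
        scale_x = scale_y = float(scale)
    else:
        scale_x, scale_y = scale
    # scale_x applies to the width (columns), scale_y to the height (rows).
    # Assumption: the scaled size is truncated to an integer.
    return int(height * scale_y), int(width * scale_x)


uniform = scaled_shape(480, 640, 0.5)
anisotropic = scaled_shape(480, 640, (0.5, 0.25))
```

The resulting (height, width) pair is the kind of target size that would then be passed to tf.image.resize.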