sleap.nn.data.resizing#
Transformers for image resizing and padding.
- class sleap.nn.data.resizing.PointsRescaler(points_key: str = 'predicted_instances', scale_key: str = 'scale', invert: bool = True)[source]#
Transformer to apply or invert scaling operations on points.
- property input_keys: List[str]#
Return the keys that incoming elements are expected to have.
- property output_keys: List[str]#
Return the keys that outgoing elements will have.
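As a rough illustration of what applying versus inverting a scale means for point coordinates, here is a pure-Python sketch. The helper `rescale_points` is hypothetical and not part of SLEAP. It assumes that inverting divides coordinates by the scale (mapping them back to original image coordinates) and that applying multiplies them.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def rescale_points(points: List[Point], scale: float, invert: bool = True) -> List[Point]:
    """Hypothetical helper: apply or undo a scaling factor on (x, y) points."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    # Assumption: inverting divides by the scale, applying multiplies by it.
    factor = 1.0 / scale if invert else scale
    return [(x * factor, y * factor) for x, y in points]


# Points predicted on an image that was downscaled by a factor of 0.5.
predicted = [(10.0, 20.0), (32.0, 48.0)]
original_coords = rescale_points(predicted, scale=0.5, invert=True)
```

With `invert=True`, as in the class default, the points are expressed in the coordinates of the image before it was scaled.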
- class sleap.nn.data.resizing.Resizer(image_key: str = 'image', scale_key: str = 'scale', points_key: Optional[str] = 'instances', scale: float = 1.0, pad_to_stride: int = 1, keep_full_image: bool = False, full_image_key: str = 'full_image')[source]#
Data transformer to resize or pad images.
This is useful as a transformation to data streams that require resizing or padding in order to be downsampled or meet divisibility criteria.
- image_key#
String name of the key containing the images to resize.
- Type:
str
- scale_key#
String name of the key containing the scale of the images.
- Type:
str
- points_key#
String name of the key containing points to adjust for the resizing operation.
- Type:
Optional[str]
- scale#
Scalar float specifying scaling factor to resize images by.
- Type:
float
- pad_to_stride#
Maximum stride in a model that the images must be divisible by. If > 1, this will pad the bottom and right of the images to ensure they meet this divisibility criterion. Padding is applied after the scaling specified in the scale attribute.
- Type:
int
- keep_full_image#
If True, keeps the (original size) full image in the examples. This is useful for multi-scale inference.
- Type:
bool
- full_image_key#
String name of the key containing the full images.
- Type:
str
- classmethod from_config(config: PreprocessingConfig, image_key: str = 'image', scale_key: str = 'scale', pad_to_stride: Optional[int] = None, keep_full_image: bool = False, full_image_key: str = 'full_image', points_key: Optional[str] = 'instances') Resizer [source]#
Build an instance of this class from its configuration options.
- Parameters:
config – A PreprocessingConfig instance with the desired parameters. If config.pad_to_stride is not an explicit integer, the pad_to_stride parameter must be provided.
image_key – String name of the key containing the images to resize.
scale_key – String name of the key containing the scale of the images.
pad_to_stride – An integer specifying the pad_to_stride if config.pad_to_stride is not an explicit integer (e.g., set to None).
keep_full_image – If True, keeps the (original size) full image in the examples. This is useful for multi-scale inference.
full_image_key – String name of the key containing the full images.
points_key – String name of the key containing points to adjust for the resizing operation.
- Returns:
An instance of this class.
- Raises:
ValueError – If config.pad_to_stride is not set to an integer and the pad_to_stride argument is not provided.
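The fallback described for pad_to_stride can be sketched as follows. `FakePreprocessingConfig` and `resolve_pad_to_stride` are hypothetical stand-ins written for this illustration. They only mirror the documented behavior: an explicit integer in the config is used, otherwise the argument is used, and a ValueError is raised when neither is available.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakePreprocessingConfig:
    """Hypothetical stand-in holding only the field used below."""

    pad_to_stride: Optional[int] = None


def resolve_pad_to_stride(
    config: FakePreprocessingConfig, pad_to_stride: Optional[int] = None
) -> int:
    """Pick the stride from the config, falling back to the argument."""
    # An explicit integer in the config is used directly.
    if isinstance(config.pad_to_stride, int):
        return config.pad_to_stride
    # Otherwise the caller must supply the value.
    if pad_to_stride is None:
        raise ValueError(
            "config.pad_to_stride is not an integer and pad_to_stride was not provided"
        )
    return pad_to_stride


from_config_value = resolve_pad_to_stride(FakePreprocessingConfig(pad_to_stride=16))
from_argument = resolve_pad_to_stride(FakePreprocessingConfig(), pad_to_stride=8)
```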
- property input_keys: List[str]#
Return the keys that incoming elements are expected to have.
- property output_keys: List[str]#
Return the keys that outgoing elements will have.
- transform_dataset(ds_input: DatasetV2) DatasetV2 [source]#
Create a dataset that contains the images and points with resizing and padding applied.
- Parameters:
ds_input – A dataset with the image specified in the image_key attribute, points specified in the points_key attribute, and the “scale” key for tracking scaling transformations.
- Returns:
A tf.data.Dataset with elements containing the same images and points with resizing applied.
The “scale” key of the example will be multiplied by the scale attribute of this transformer.
If the keep_full_image attribute is True, a key specified by full_image_key will be added to the example containing the image before any processing.
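The bookkeeping described for this transformer (scale first, then pad the bottom and right, and multiply the “scale” key) can be sketched without TensorFlow. The helper `resized_example_metadata` is hypothetical, and it assumes that the scaled size is truncated to an integer, which the text above does not state.

```python
from typing import Dict


def resized_example_metadata(
    height: int,
    width: int,
    example_scale: float,
    scale: float = 1.0,
    pad_to_stride: int = 1,
) -> Dict[str, object]:
    """Hypothetical helper: output shape and updated scale after resizing."""
    # Scaling comes first. Assumption: the scaled size is truncated to an integer.
    new_height = int(height * scale)
    new_width = int(width * scale)
    # Padding is applied after scaling, on the bottom and right only.
    pad_bottom = (-new_height) % pad_to_stride
    pad_right = (-new_width) % pad_to_stride
    return {
        "image_shape": (new_height + pad_bottom, new_width + pad_right),
        # The "scale" key is multiplied by the transformer's scale.
        "scale": example_scale * scale,
    }


meta = resized_example_metadata(
    height=1000, width=1280, example_scale=1.0, scale=0.5, pad_to_stride=32
)
```

Here the 1000 x 1280 image is scaled to 500 x 640, and the height is then padded by 12 rows so that it becomes divisible by 32.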
- class sleap.nn.data.resizing.SizeMatcher(image_key: str = 'image', scale_key: str = 'scale', points_key: Optional[str] = 'instances', keep_full_image: bool = False, full_image_key: str = 'full_image', max_image_height: Optional[int] = None, max_image_width: Optional[int] = None, center_pad: bool = False)[source]#
Data transformer that ensures output images have uniform shape by resizing/padding smaller images.
- image_key#
String name of the key containing the images to resize.
- Type:
str
- scale_key#
String name of the key containing the scale of the images.
- Type:
str
- points_key#
String name of the key containing points to adjust for the resizing operation.
- Type:
Optional[str]
- keep_full_image#
If True, keeps the (original size) full image in the examples. This is useful for multi-scale inference.
- Type:
bool
- full_image_key#
String name of the key containing the full images.
- Type:
str
- max_image_height#
The target height to which all smaller images will be resized/padded.
- Type:
int
- max_image_width#
The target width to which all smaller images will be resized/padded.
- Type:
int
- center_pad#
If True, pad on the left/top to center the non-zero pixels rather than aligning them to the top-left. The offsets will be stored in the “offset_x” and “offset_y” keys.
- Type:
bool
- classmethod from_config(config: PreprocessingConfig, provider: Optional[Provider] = None, update_config: bool = True, image_key: str = 'image', scale_key: str = 'scale', keep_full_image: bool = False, full_image_key: str = 'full_image', points_key: Optional[str] = 'instances') SizeMatcher [source]#
Build an instance of this class from configuration.
- Parameters:
config – A PreprocessingConfig instance with the desired parameters. If config.resize_and_pad_to_target is True and ‘target_height’ / ‘target_width’ are not set, a provider that implements ‘max_height_and_width’ must be supplied.
provider – Data provider.
update_config – If True, the input model configuration will be updated with values inferred from other fields.
image_key – String name of the key containing the images to resize.
scale_key – String name of the key containing the scale of the images.
pad_to_stride – An integer specifying the pad_to_stride if config.pad_to_stride is not an explicit integer (e.g., set to None).
keep_full_image – If True, keeps the (original size) full image in the examples. This is useful for multi-scale inference.
full_image_key – String name of the key containing the full images.
points_key – String name of the key containing points to adjust for the resizing operation.
- Returns:
An instance of this class.
- property input_keys: List[str]#
Return the keys that incoming elements are expected to have.
- property output_keys: List[str]#
Return the keys that outgoing elements will have.
- transform_dataset(ds_input: DatasetV2) DatasetV2 [source]#
Transform a dataset with variable size images into one with fixed sizes.
- Parameters:
ds_input – A dataset with the image specified in the image_key attribute, points specified in the points_key attribute, and the “scale” key for tracking scaling transformations.
- Returns:
A tf.data.Dataset with elements containing the same images and points, with all images brought to equal size.
If the keep_full_image attribute is True, a key specified by full_image_key will be added to the example containing the image before any processing.
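One way to picture size matching is the sketch below. It is not taken from the SLEAP source: the helper `match_size` is hypothetical, and it assumes the image is resized by a single factor that preserves the aspect ratio and is then zero-padded to the target size, with the padding split evenly when center_pad is True.

```python
from typing import Dict


def match_size(
    height: int,
    width: int,
    max_image_height: int,
    max_image_width: int,
    center_pad: bool = False,
) -> Dict[str, object]:
    """Hypothetical helper: resize factor and padding offsets for one image."""
    # Assumption: one factor for both axes, so the aspect ratio is preserved.
    scale = min(max_image_height / height, max_image_width / width)
    new_height = int(height * scale)
    new_width = int(width * scale)
    # With center_pad, half of the leftover space goes to the left/top.
    offset_x = (max_image_width - new_width) // 2 if center_pad else 0
    offset_y = (max_image_height - new_height) // 2 if center_pad else 0
    return {
        "scale": scale,
        "resized_shape": (new_height, new_width),
        "offset_x": offset_x,
        "offset_y": offset_y,
    }


top_left = match_size(240, 320, max_image_height=480, max_image_width=1024)
centered = match_size(240, 320, max_image_height=480, max_image_width=1024, center_pad=True)
```

In this example the 240 x 320 image is enlarged to 480 x 640, leaving 384 columns of padding that are either all placed on the right or split so that the image starts at column 192.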
- sleap.nn.data.resizing.find_padding_for_stride(image_height: int, image_width: int, max_stride: int) Tuple[int, int] [source]#
Compute padding required to ensure image is divisible by a stride.
This function is useful for determining how to pad images such that they will not have issues with divisibility after repeated pooling steps.
- Parameters:
image_height – Scalar integer specifying the image height (rows).
image_width – Scalar integer specifying the image width (columns).
max_stride – Scalar integer specifying the maximum stride that the image must be divisible by.
- Returns:
A tuple of (pad_bottom, pad_right), integers with the number of pixels that the image would need to be padded by to meet the divisibility requirement.
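The padding arithmetic can be written in plain Python as a minimal sketch. The function name `find_padding_for_stride_sketch` is hypothetical, chosen to avoid implying that this is the library's own implementation.

```python
from typing import Tuple


def find_padding_for_stride_sketch(
    image_height: int, image_width: int, max_stride: int
) -> Tuple[int, int]:
    """Return (pad_bottom, pad_right) needed to reach the next multiple of max_stride."""
    # The outer modulo keeps the padding at 0 when the size is already divisible.
    pad_bottom = (max_stride - image_height % max_stride) % max_stride
    pad_right = (max_stride - image_width % max_stride) % max_stride
    return pad_bottom, pad_right


padding = find_padding_for_stride_sketch(100, 130, 16)
no_padding = find_padding_for_stride_sketch(128, 256, 16)
```

For a 100 x 130 image and a stride of 16, the padded size becomes 112 x 144.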
- sleap.nn.data.resizing.pad_to_stride(image: Tensor, max_stride: int) Tensor #
Pad an image to meet a max stride constraint.
This is useful for ensuring there is no size mismatch between an image and the output tensors after multiple downsampling and upsampling steps.
- Parameters:
image – Single image tensor of shape (height, width, channels).
max_stride – Scalar integer specifying the maximum stride that the image must be divisible by. This is the ratio between the length of the image and the length of the smallest tensor it is converted to. This is typically 2 ** n_down_blocks, where n_down_blocks is the number of 2-strided reduction layers in the model.
- Returns:
The input image with 0-padding applied to the bottom and/or right such that the new shape’s height and width are both divisible by max_stride.
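To see why divisibility by max_stride avoids a size mismatch, the sketch below follows one image dimension through downsampling and upsampling. The helper `roundtrip_length` is hypothetical, and it assumes each down block halves the length with floor division while each up block doubles it.

```python
def roundtrip_length(length: int, n_down_blocks: int) -> int:
    """Hypothetical helper: length after n 2-strided reductions and n 2x upsamplings."""
    # Assumption: each reduction floors the halved length.
    for _ in range(n_down_blocks):
        length //= 2
    for _ in range(n_down_blocks):
        length *= 2
    return length


n_down_blocks = 3
max_stride = 2 ** n_down_blocks

unpadded = 100
# Pad up to the next multiple of max_stride.
padded = unpadded + (-unpadded) % max_stride

unpadded_output = roundtrip_length(unpadded, n_down_blocks)
padded_output = roundtrip_length(padded, n_down_blocks)
```

Under these assumptions, a length of 100 comes back as 96, while the padded length of 104 comes back unchanged.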
- sleap.nn.data.resizing.resize_image(image: Tensor, scale: Tensor) Tensor [source]#
Rescale an image by a scale factor.
This function is primarily a convenience wrapper for tf.image.resize that calculates the new shape from the scale factor.
- Parameters:
image – Single image tensor of shape (height, width, channels).
scale – Factor to resize the image dimensions by, specified as either a float scalar or as a 2-tuple of [scale_x, scale_y]. If a scalar is provided, both dimensions are resized by the same factor.
- Returns:
The resized image tensor of the same dtype but scaled height and width.
See also: tf.image.resize
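The shape calculation implied by the scale argument can be sketched in plain Python. The helper `scaled_shape` is hypothetical, and it assumes the scaled dimensions are truncated to integers, since the rounding behavior is not stated here.

```python
from typing import Sequence, Tuple, Union


def scaled_shape(
    height: int, width: int, scale: Union[float, Sequence[float]]
) -> Tuple[int, int]:
    """Hypothetical helper: (new_height, new_width) for a given scale factor."""
    if isinstance(scale, (int, float)):
        # A scalar resizes both dimensions by the same factor.
        scale_x = scale_y = float(scale)
    else:
        # A 2-tuple is ordered as [scale_x, scale_y].
        scale_x, scale_y = scale
    # Assumption: fractional sizes are truncated.
    return int(height * scale_y), int(width * scale_x)


uniform = scaled_shape(480, 640, 0.5)
per_axis = scaled_shape(480, 640, (0.5, 0.25))
```

Note that the tuple is given as x then y, while the resulting shape is ordered as height then width.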