sleap.nn.data.instance_cropping

Transformers for cropping instances for topdown processing.

class sleap.nn.data.instance_cropping.InstanceCropper(crop_width: int, crop_height: int, keep_full_image: bool = False, mock_centroid_confidence: bool = False, unbatch: bool = True, image_key: str = 'image', instances_key: str = 'instances', centroids_key: str = 'centroids')

Data transformer to crop and generate individual examples for instances.

This generates datasets that are instance cropped for topdown processing.

crop_width

Width of the crops in pixels.

Type: int

crop_height

Height of the crops in pixels.

Type: int

keep_full_image

If True, the output examples will contain the full images provided as input to the instance cropper. This can be useful for pipelines that use both full and cropped images, at the cost of increased memory usage. Setting this to False can substantially improve performance of large pipelines if the full images are no longer required.

Type: bool

mock_centroid_confidence

If True, add confidence keys for compatibility with predicted instance cropping.

Type: bool

unbatch

If True (the default), split frame-level examples into multiple instance-level examples. If False, all instance crops will be kept within the same example. Set this to False when building pipelines that require knowledge about all instances within a single example.

Type: bool

image_key

Name of the example key where the image is stored. Defaults to “image”.

Type: str

instances_key

Name of the example key where the instance points are stored. Defaults to “instances”.

Type: str

centroids_key

Name of the example key where the instance centroids are stored. Defaults to “centroids”.

Type: str

classmethod from_config(config: InstanceCroppingConfig, crop_size: Optional[int] = None) -> InstanceCropper

Build an instance of this class from its configuration options.

Parameters:
  • config – An InstanceCroppingConfig instance with the desired parameters.

  • crop_size – Integer specifying the crop height and width. This is only required and will only be used if the config.crop_size attribute does not specify an explicit integer crop size (e.g., it is set to None).

Returns:

An instance of this class.

Raises:

ValueError – If the crop_size is not specified in either the config attribute or function arguments.
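The precedence between the two crop size sources can be sketched as follows. This is a minimal illustration under stated assumptions: `FakeCroppingConfig` and `resolve_crop_size` are hypothetical stand-ins that are not part of SLEAP, and the real method may differ in detail.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeCroppingConfig:
    """Hypothetical stand-in for InstanceCroppingConfig (only crop_size is modeled)."""

    crop_size: Optional[int] = None


def resolve_crop_size(config: FakeCroppingConfig, crop_size: Optional[int] = None) -> int:
    """Hypothetical helper: prefer config.crop_size, fall back to the argument."""
    if isinstance(config.crop_size, int):
        return config.crop_size  # an explicit integer in the config wins
    if crop_size is not None:
        return crop_size  # only used when the config does not specify a size
    raise ValueError("crop_size must be set in the config or passed as an argument.")
```

Under these assumptions, a config with an explicit size ignores the argument, a config with `crop_size=None` uses the argument, and supplying neither raises `ValueError`.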

property input_keys: List[str]

Return the keys that incoming elements are expected to have.

property output_keys: List[str]

Return the keys that outgoing elements will have.

transform_dataset(input_ds: DatasetV2) -> DatasetV2

Create a dataset that contains instance cropped data.

Parameters:
  • input_ds – A dataset with examples containing the following keys:

      “image”: The full image in a tensor of shape (height, width, channels).

      “instances”: Instance points in a tf.float32 tensor of shape (n_instances, n_nodes, 2).

      “centroids”: The computed centroid for each instance in a tf.float32 tensor of shape (n_instances, 2).

      “track_inds”: The track indices of the instances, if available.

    Any additional keys present will be replicated in each output.

Returns:

A tf.data.Dataset with elements containing instance cropped data. Each instance will generate an example, so the total number of elements may change relative to the input dataset.

Each element in the output dataset will have the following keys:

  • “instance_image”: A cropped image of the same dtype as the input image, but with shape (crop_height, crop_width, channels), centered on an instance.

  • “bbox”: The bounding box in absolute image coordinates, in the format (y1, x1, y2, x2), that resulted in the cropped image in “instance_image”. This will be a tf.float32 tensor of shape (4,).

  • “center_instance”: The points of the centered instance, expressed in the image coordinates of “instance_image”. This will be a tf.float32 tensor of shape (n_nodes, 2). The absolute image coordinates can be recovered by adding (x1, y1) from the “bbox” key.

  • “center_instance_ind”: Scalar tf.int32 index of the centered instance relative to all the instances in the frame. This can be used to index into additional keys that may contain data from all instances.

  • “track_ind”: Index of the track the instance belongs to, if available.

  • “all_instances”: The points of all instances in the frame, expressed in the image coordinates of “instance_image”. This will be a tf.float32 tensor of shape (n_instances, n_nodes, 2). This is useful for multi-stage models that first predict all nodes and subsequently refine the prediction to just the centered instance. The “center_instance_ind”-th row of this tensor is equal to “center_instance”.

  • “centroid”: The centroid coordinate that was used to generate this crop, specified as a tf.float32 tensor of shape (2,) in absolute image coordinates.

  • “full_image_height”: The height of the full image from which the crop was generated, specified as a scalar tf.int32 tensor.

  • “full_image_width”: The width of the full image from which the crop was generated, specified as a scalar tf.int32 tensor.

If keep_full_image is True, examples will also have an “image” key containing the same image as the input.

If mock_centroid_confidence is True, examples will also have a “centroid_confidence” key with all ones. This is useful for evaluating models that use crops independently from centroid inference.

Additional keys will be replicated in each example under the same name.
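The relationship between “bbox”, “center_instance” and the absolute coordinates can be sketched in plain Python. This is an illustration only, not the TensorFlow implementation: `split_frame_example` and `to_absolute` are hypothetical helpers, only a subset of the output keys is modeled, and the box arithmetic assumes the pixel-center convention described for make_centered_bboxes.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y)


def split_frame_example(
    instances: List[List[Point]],
    centroids: List[Point],
    crop_height: int,
    crop_width: int,
) -> List[Dict[str, object]]:
    """Hypothetical helper: build one output-like dict per instance in a frame."""
    examples = []
    for ind, (cx, cy) in enumerate(centroids):
        # Box limits sit at pixel centers, so a box of width w spans w - 1 units.
        x1 = cx - (crop_width - 1) / 2
        y1 = cy - (crop_height - 1) / 2
        examples.append({
            "bbox": (y1, x1, y1 + crop_height - 1, x1 + crop_width - 1),
            "centroid": (cx, cy),
            "center_instance_ind": ind,
            # Shift points into crop coordinates by subtracting the box origin.
            "center_instance": [(x - x1, y - y1) for x, y in instances[ind]],
        })
    return examples


def to_absolute(example: Dict[str, object]) -> List[Point]:
    """Recover absolute coordinates by adding (x1, y1) from the bbox."""
    y1, x1, _, _ = example["bbox"]
    return [(x + x1, y + y1) for x, y in example["center_instance"]]


frame_instances = [[(10.0, 12.0), (14.0, 18.0)], [(40.0, 42.0), (44.0, 48.0)]]
frame_centroids = [(12.0, 15.0), (42.0, 45.0)]
examples = split_frame_example(frame_instances, frame_centroids, crop_height=9, crop_width=9)
```

One frame with two instances yields two examples here, which mirrors why the number of elements can change relative to the input dataset.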

sleap.nn.data.instance_cropping.crop_bboxes(image: Tensor, bboxes: Tensor) -> Tensor

Crop bounding boxes from an image.

This function is a convenience for specifying the arguments of tf.image.crop_and_resize. It is especially useful when cropping multiple bounding boxes from a single image without resizing.

Parameters:
  • image – Tensor of shape (height, width, channels) of a single image.

  • bboxes – Tensor of shape (n_bboxes, 4) and dtype tf.float32, where the last axis corresponds to (y1, x1, y2, x2) coordinates of the bounding boxes. This can be generated from centroids using make_centered_bboxes.

Returns:

A tensor of shape (n_bboxes, crop_height, crop_width, channels) of the same dtype as the input image. The crop size is inferred from the bounding box coordinates.

Notes

This function expects bounding boxes with coordinates at the centers of the pixels in the box limits. Technically, the box will span (x1 - 0.5, x2 + 0.5) and (y1 - 0.5, y2 + 0.5).

For example, a 3x3 patch centered at (1, 1) would be specified by (y1, x1, y2, x2) = (0, 0, 2, 2). This would be exactly equivalent to indexing the image with image[0:3, 0:3].
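The inclusive box limits can be checked with a small pure-Python sketch. `crop_bbox` is a hypothetical analogue written for this note; it handles only integer-aligned, in-bounds boxes and does not call TensorFlow.

```python
from typing import List, Sequence


def crop_bbox(image: List[List[int]], bbox: Sequence[float]) -> List[List[int]]:
    """Hypothetical pure-Python analogue of one integer-aligned crop."""
    y1, x1, y2, x2 = (int(v) for v in bbox)
    # Limits are inclusive pixel centers, so each slice ends at the limit + 1.
    return [row[x1 : x2 + 1] for row in image[y1 : y2 + 1]]


# A 4 x 5 single-channel image whose values encode (row, column).
image = [[10 * r + c for c in range(5)] for r in range(4)]
patch = crop_bbox(image, (0, 0, 2, 2))
```

Here `patch` has 3 rows and 3 columns and matches slicing rows 0:3 and columns 0:3 of the nested list.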

See also: make_centered_bboxes

sleap.nn.data.instance_cropping.find_instance_crop_size(labels: Labels, padding: int = 0, maximum_stride: int = 2, input_scaling: float = 1.0, min_crop_size: Optional[int] = None) -> int

Compute the size of the largest instance bounding box from labels.

Parameters:
  • labels – A sleap.Labels containing user-labeled instances.

  • padding – Integer number of pixels to add to the bounds as margin padding.

  • maximum_stride – Ensure that the returned crop size is divisible by this value. Useful for ensuring that the crop size will not be truncated in a given architecture.

  • input_scaling – Float factor indicating the scale of the input images if any scaling will be done before cropping.

  • min_crop_size – The (optional) crop size set by the user. None if not set.

Returns:

An integer crop size denoting the length of the side of the bounding boxes that will contain the instances when cropped. The returned crop size will be greater than or equal to min_crop_size, if one is given.

This accounts for stride, padding and scaling when ensuring divisibility.
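The interaction of scaling, padding and stride can be sketched as below. This is a hedged approximation: `sketch_crop_size` is a hypothetical function that takes plain lists of (x, y) points instead of a sleap.Labels object, ignores missing (NaN) points, and assumes the size is rounded up to the next multiple of maximum_stride.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y)


def sketch_crop_size(
    instances: List[List[Point]],
    padding: int = 0,
    maximum_stride: int = 2,
    input_scaling: float = 1.0,
    min_crop_size: Optional[int] = None,
) -> int:
    """Hypothetical sketch: largest instance extent, padded, rounded up to the stride."""
    longest = 0.0
    for points in instances:
        xs = [x * input_scaling for x, _ in points]
        ys = [y * input_scaling for _, y in points]
        # Track the larger of the width and height of each instance's bounds.
        longest = max(longest, max(xs) - min(xs), max(ys) - min(ys))
    # Assumption: the user-provided minimum acts as a floor on the padded size.
    size = max(longest + padding, float(min_crop_size or 0))
    return int(math.ceil(size / maximum_stride) * maximum_stride)


toy_instances = [[(10.0, 20.0), (40.0, 25.0)], [(5.0, 5.0), (15.0, 50.0)]]
crop_size = sketch_crop_size(toy_instances, padding=16, maximum_stride=16)
```

In this toy case the largest extent is 45 pixels, padding brings it to 61, and rounding up to a multiple of 16 gives 64.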

sleap.nn.data.instance_cropping.make_centered_bboxes(centroids: Tensor, box_height: int, box_width: int) -> Tensor

Generate bounding boxes centered on a set of centroid coordinates.

Parameters:
  • centroids – A tensor of shape (n_centroids, 2) and dtype tf.float32, where the last axis corresponds to the (x, y) coordinates of each centroid.

  • box_height – Scalar integer indicating the height of the bounding boxes.

  • box_width – Scalar integer indicating the width of the bounding boxes.

Returns:

Tensor of shape (n_centroids, 4) and dtype tf.float32, where the last axis corresponds to (y1, x1, y2, x2) coordinates of the bounding boxes in absolute image coordinates.

Notes

The bounding box coordinates are calculated such that the centroid coordinates map onto the center of the pixel. For example:

For a single row image of shape (1, 4) with values: [[a, b, c, d]], the x coordinates can be visualized in the diagram below. The upper row of numbers marks the pixel edges and the lower row marks the pixel centers:

    |   a   |   b   |   c   |   d   |
  -0.5     0.5     1.5     2.5     3.5
        0       1       2       3

To get a (1, 3) patch centered at c, the centroid would be at (x, y) = (2, 0) with box height of 1 and box width of 3, to yield [[b, c, d]].

For even-sized bounding boxes, e.g., to get the center 2 elements, the centroid would be at (x, y) = (1.5, 0) with box height of 1 and box width of 2, to yield [[b, c]].
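Both examples can be reproduced with a short pure-Python sketch. `centered_bboxes` is a hypothetical helper, not the TensorFlow function; it assumes each box extends (size - 1) / 2 on either side of the centroid, which is the offset consistent with the pixel-center convention above.

```python
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (y1, x1, y2, x2)


def centered_bboxes(
    centroids: List[Tuple[float, float]], box_height: int, box_width: int
) -> List[BBox]:
    """Hypothetical sketch: (y1, x1, y2, x2) boxes around (x, y) centroids."""
    half_h = (box_height - 1) / 2
    half_w = (box_width - 1) / 2
    return [(y - half_h, x - half_w, y + half_h, x + half_w) for x, y in centroids]


odd = centered_bboxes([(2.0, 0.0)], box_height=1, box_width=3)   # patch centered at c
even = centered_bboxes([(1.5, 0.0)], box_height=1, box_width=2)  # center 2 elements
```

The odd-sized box spans x = 1 to x = 3, covering b, c and d, and the even-sized box spans x = 1 to x = 2, covering b and c.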

sleap.nn.data.instance_cropping.normalize_bboxes(bboxes: Tensor, image_height: int, image_width: int) -> Tensor

Normalize bounding box coordinates to the range [0, 1].

This is useful for transforming points for TensorFlow operations that require normalized image coordinates.

Parameters:
  • bboxes – Tensor of shape (n_bboxes, 4) and dtype tf.float32, where the last axis corresponds to (y1, x1, y2, x2) coordinates of the bounding boxes.

  • image_height – Scalar integer indicating the height of the image.

  • image_width – Scalar integer indicating the width of the image.

Returns:

Tensor of the normalized points of the same shape as bboxes.

The normalization applied to each point is x / (image_width - 1) and y / (image_height - 1).

See also: unnormalize_bboxes

sleap.nn.data.instance_cropping.unnormalize_bboxes(normalized_bboxes: Tensor, image_height: int, image_width: int) -> Tensor

Convert bounding box coordinates in the range [0, 1] to absolute coordinates.

Parameters:
  • normalized_bboxes – Tensor of shape (n_bboxes, 4) and dtype tf.float32, where the last axis corresponds to (y1, x1, y2, x2) normalized coordinates of the bounding boxes in the range [0, 1].

  • image_height – Scalar integer indicating the height of the image.

  • image_width – Scalar integer indicating the width of the image.

Returns:

Tensor of the same shape as normalized_bboxes mapped back to absolute image coordinates by multiplying (x, y) coordinates by (image_width - 1, image_height - 1).
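The round trip between normalized and absolute coordinates can be sketched in plain Python. `normalize` and `unnormalize` below are hypothetical list-based analogues of the two functions, written only to show the (size - 1) scaling; they do not use TensorFlow.

```python
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (y1, x1, y2, x2)


def normalize(bboxes: List[BBox], image_height: int, image_width: int) -> List[BBox]:
    """Hypothetical analogue: divide y by (height - 1) and x by (width - 1)."""
    h, w = image_height - 1, image_width - 1
    return [(y1 / h, x1 / w, y2 / h, x2 / w) for y1, x1, y2, x2 in bboxes]


def unnormalize(bboxes: List[BBox], image_height: int, image_width: int) -> List[BBox]:
    """Hypothetical analogue: multiply y by (height - 1) and x by (width - 1)."""
    h, w = image_height - 1, image_width - 1
    return [(y1 * h, x1 * w, y2 * h, x2 * w) for y1, x1, y2, x2 in bboxes]


# A box covering the whole 100 x 200 image maps onto the unit square.
full = normalize([(0.0, 0.0, 99.0, 199.0)], image_height=100, image_width=200)

boxes = [(10.0, 20.0, 73.0, 83.0)]
restored = unnormalize(normalize(boxes, 100, 200), 100, 200)
```

Dividing by size - 1 rather than size means the center of the last pixel maps to exactly 1, so `restored` matches `boxes` up to floating point error.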

See also: normalize_bboxes