sleap.nn.data.providers#

Data providers for pipeline I/O.

class sleap.nn.data.providers.LabelsReader(labels: Labels, example_indices: Sequence[int] | ndarray | None = None, user_instances_only: bool = False, with_track_only: bool = False)[source]#

Data provider from a sleap.Labels instance.

This class can generate `tf.data.Dataset`s from a set of labels for use in data pipelines. Each element in the dataset will contain the data from a single `LabeledFrame`.

labels#

The sleap.Labels instance to generate data from.

Type:

sleap.io.dataset.Labels

example_indices#

List or numpy array of ints with the labeled frame indices to use when iterating over the labels. Use this to specify subsets of the labels to use, which is particularly handy for creating data splits. If not provided, the entire labels dataset will be read. These indices apply to the labeled frames in the labels attribute, which may have been reordered or filtered.

Type:

Sequence[int] | numpy.ndarray | None
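A minimal sketch of building disjoint index subsets for a train/validation split. It uses only the standard library; `split_indices`, `n_frames`, and `val_fraction` are hypothetical names chosen for illustration and are not part of SLEAP. Each resulting list could then be supplied as example_indices to a separate reader.

```python
import random
from typing import List, Tuple


def split_indices(
    n_frames: int, val_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Shuffle frame indices and split them into (train, val) lists."""
    indices = list(range(n_frames))
    random.Random(seed).shuffle(indices)  # seeded for a reproducible split
    n_val = int(round(n_frames * val_fraction))
    val = sorted(indices[:n_val])
    train = sorted(indices[n_val:])
    return train, val


train_inds, val_inds = split_indices(n_frames=10, val_fraction=0.2)
# Assumption: each list would be given to its own reader, e.g.
# LabelsReader(labels, example_indices=train_inds)
```

The two lists never overlap and together cover every frame index, so no labeled frame is shared between splits.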

user_instances_only#

If True, load only user labeled instances. If False, all instances will be loaded.

Type:

bool

with_track_only#

If True, load only instances that have a track assigned. Useful when training supervised ID models.

Type:

bool

classmethod from_filename(filename: str, user_instances: bool = True) LabelsReader[source]#

Create a LabelsReader from a saved labels file.

Parameters:
  • filename – Path to a saved labels file.

  • user_instances – If True, will use only labeled frames with user instances.

Returns:

A LabelsReader instance that can create a dataset for pipelining.

classmethod from_unlabeled_suggestions(labels: Labels) LabelsReader[source]#

Create a LabelsReader using the unlabeled suggestions in a Labels set.

Parameters:
  • labels – A sleap.Labels instance containing unlabeled suggestions.

Returns:

A LabelsReader instance that can create a dataset for pipelining.

classmethod from_user_instances(labels: Labels, with_track_only: bool = False) LabelsReader[source]#

Create a LabelsReader using the user instances in a Labels set.

Parameters:
  • labels – A sleap.Labels instance containing user instances.

  • with_track_only – If True, load only instances that have a track assigned. Useful when training supervised ID models.

Returns:

A LabelsReader instance that can create a dataset for pipelining.

Notes

This will remove “empty” instances, i.e., instances with no visible points, in the original labels. Make a copy of the original labels if needed as they will be modified in place.
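The in-place behavior noted above can be illustrated with plain Python stand-ins. This sketch does not use SLEAP: `Instance`, `Frame`, and `remove_empty_instances` are hypothetical, and `None` is assumed to mark a point that is not visible. It shows why copying first preserves the original data.

```python
import copy
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Optional[Tuple[float, float]]  # None marks a non-visible point


@dataclass
class Instance:
    points: List[Point]


@dataclass
class Frame:
    instances: List[Instance] = field(default_factory=list)


def remove_empty_instances(frames: List[Frame]) -> None:
    """Drop instances with no visible points, modifying frames in place."""
    for frame in frames:
        frame.instances = [
            inst
            for inst in frame.instances
            if any(pt is not None for pt in inst.points)
        ]


frames = [Frame([Instance([(1.0, 2.0), None]), Instance([None, None])])]
backup = copy.deepcopy(frames)  # keep an untouched copy before filtering
remove_empty_instances(frames)
```

After the call, `frames` holds only the instance with a visible point, while `backup` still holds both.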

classmethod from_user_labeled_frames(labels: Labels) LabelsReader[source]#

Create a LabelsReader using the user labeled frames in a Labels set.

Parameters:
  • labels – A sleap.Labels instance containing user labeled frames.

Returns:

A LabelsReader instance that can create a dataset for pipelining. Note that this constructor will load ALL instances in frames that have user instances. To load only user labeled instances, use LabelsReader.from_user_instances.

property is_from_multi_size_videos: bool#

Return True if labels contain videos with different sizes.

make_dataset(ds_index: DatasetV2 | None = None) DatasetV2[source]#

Return a tf.data.Dataset whose elements are data from labeled frames.

Returns:

A dataset whose elements are dictionaries with the loaded data associated with a single LabeledFrame. Items will be converted to tensors. These are:

  • “image”: Tensors of shape (height, width, channels) containing the full raw frame image. The dtype is determined by the input data.

  • “raw_image_size”: The image size when it was first read as a tf.int32 tensor of shape (3,) representing [height, width, channels]. This is useful for keeping track of absolute image coordinates if downstream processing modules resize, crop or pad the image.

  • “example_ind”: Index of the individual labeled frame within the labels stored in the labels attribute of this reader.

  • “video_ind”: Index of the video within the Labels.videos list that the labeled frame comes from. Tensor will be a scalar of dtype tf.int32.

  • “frame_ind”: Index of the frame within the video that the labeled frame comes from. Tensor will be a scalar of dtype tf.int64.

  • “scale”: The relative scaling factor of each image dimension specified as a tf.float32 tensor of shape (2,) representing the (x_scale, y_scale) of the example. This is always (1.0, 1.0) when the images are initially read, but may be modified downstream in order to keep track of scaling operations. This is especially important to keep track of changes to the aspect ratio of the image grid in order to properly map points to image coordinates.

  • “instances”: Tensor of shape (n_instances, n_nodes, 2) of dtype float32 containing all of the instances in the frame.

  • “skeleton_inds”: Tensor of shape (n_instances,) of dtype tf.int32 that specifies the index of the skeleton used for each instance.

  • “track_inds”: Tensor of shape (n_instances,) of dtype tf.int32 that specifies the index of the instance track identity. If not specified in the labels, this is set to -1.
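As a rough illustration of why the “scale” entry matters, the sketch below maps points from a resized image back to raw image coordinates. It assumes that scale is the multiplicative factor applied to each image axis, so dividing by it undoes the resize; `to_raw_coords` is a hypothetical helper written in plain Python, not a SLEAP function.

```python
from typing import List, Tuple


def to_raw_coords(
    points: List[Tuple[float, float]], scale: Tuple[float, float]
) -> List[Tuple[float, float]]:
    """Map (x, y) points in a resized image back to raw image coordinates."""
    x_scale, y_scale = scale
    # Assumption: resized = raw * scale, so raw = resized / scale.
    return [(x / x_scale, y / y_scale) for x, y in points]


# An image shrunk to half its width and a quarter of its height.
resized_points = [(10.0, 5.0), (32.0, 16.0)]
raw_points = to_raw_coords(resized_points, scale=(0.5, 0.25))
```

Because the two axes carry separate factors, a resize that changes the aspect ratio can still be inverted correctly.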

property max_height_and_width: Tuple[int, int]#

Return (height, width) that is the maximum of all videos.
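One plausible reading of this property, sketched in plain Python: the maximum is taken separately for each dimension, so the result need not match any single video. That reading is an assumption, and `max_height_and_width` and `is_multi_size` below are hypothetical stand-ins operating on (height, width) tuples rather than on Video objects.

```python
from typing import List, Tuple


def max_height_and_width(shapes: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Return the per-dimension maximum over (height, width) pairs."""
    max_height = max(height for height, _ in shapes)
    max_width = max(width for _, width in shapes)
    return max_height, max_width


def is_multi_size(shapes: List[Tuple[int, int]]) -> bool:
    """Return True if the shapes are not all identical."""
    return len(set(shapes)) > 1


shapes = [(480, 640), (512, 512)]  # (height, width) of two videos
target = max_height_and_width(shapes)
```

A target size like this is useful for padding frames from differently sized videos to a common shape.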

property output_keys: List[str]#

Return the output keys that the dataset will produce.

property tracks: List[Track]#

Return the list of tracks that track_inds in examples match up with.
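A small sketch of resolving track_inds values against a track list, using plain Python lists in place of tensors and strings in place of Track objects. `lookup_tracks` is a hypothetical helper. The explicit check matters because -1 marks a missing track, whereas a naive Python index of -1 would silently return the last track.

```python
from typing import List, Optional


def lookup_tracks(
    track_inds: List[int], tracks: List[str]
) -> List[Optional[str]]:
    """Map each track index to its track, or None when the index is -1."""
    return [tracks[ind] if ind >= 0 else None for ind in track_inds]


tracks = ["female", "male"]  # stand-ins for Track objects
resolved = lookup_tracks([0, -1, 1], tracks)
```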

property videos: List[Video]#

Return the list of videos that video_ind in examples match up with.

class sleap.nn.data.providers.VideoReader(video: Video, example_indices: Sequence[int] | ndarray | None = None)[source]#

Data provider from a sleap.Video instance.

This class can generate `tf.data.Dataset`s from a video for use in data pipelines. Each element in the dataset will contain the image data from a single frame.

video#

The sleap.Video instance to generate data from.

Type:

sleap.io.video.Video

example_indices#

List or numpy array of ints with the frame indices to use when iterating over the video. Use this to specify subsets of the video to read. If not provided, the entire video will be read.

Type:

Sequence[int] | numpy.ndarray | None
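A minimal sketch of choosing a subset of frames to read, here every `step`-th frame. `strided_indices` is a hypothetical helper and `n_frames` stands for the video length; neither comes from SLEAP. The resulting list could be supplied as example_indices.

```python
from typing import List


def strided_indices(n_frames: int, step: int, start: int = 0) -> List[int]:
    """Return every step-th frame index in [start, n_frames)."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(range(start, n_frames, step))


frame_inds = strided_indices(n_frames=10, step=3)
# Assumption: the list would be passed to the reader, e.g.
# VideoReader(video, example_indices=frame_inds)
```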

video_ind#

Scalar index of video to keep with each example. Helpful when running inference across videos.

classmethod from_filepath(filename: str, example_indices: Sequence[int] | ndarray | None = None, **kwargs) VideoReader[source]#

Create a VideoReader from a video file.

Parameters:
  • filename – Path to a video file.

  • example_indices – List or numpy array of ints with the frame indices to use when iterating over the video. Use this to specify subsets of the video to read. If not provided, the entire video will be read.

  • **kwargs – Any other video keyword argument (e.g., grayscale, dataset).

Returns:

A VideoReader instance that can create a dataset for pipelining.

make_dataset() DatasetV2[source]#

Return a tf.data.Dataset whose elements are data from video frames.

Returns:

A dataset whose elements are dictionaries with the loaded data associated with a single video frame. Items will be converted to tensors. These are:

  • “image”: Tensors of shape (height, width, channels) containing the full raw frame image.

  • “raw_image_size”: The image size when it was first read as a tf.int32 tensor of shape (3,) representing [height, width, channels]. This is useful for keeping track of absolute image coordinates if downstream processing modules resize, crop or pad the image.

  • “video_ind”: Index of the video (always 0). Can be used to index into the videos attribute of the provider.

  • “frame_ind”: Index of the frame within the video that the frame comes from. This is the same as the input index, but is also provided for convenience in downstream processing.

  • “scale”: The relative scaling factor of each image dimension specified as a tf.float32 tensor of shape (2,) representing the (x_scale, y_scale) of the example. This is always (1.0, 1.0) when the images are initially read, but may be modified downstream in order to keep track of scaling operations. This is especially important to keep track of changes to the aspect ratio of the image grid in order to properly map points to image coordinates.

property max_height_and_width: Tuple[int, int]#

Return (height, width) that is the maximum of all videos.

property output_keys: List[str]#

Return the output keys that the dataset will produce.

property videos: List[Video]#

Return the list of videos that video_ind in examples match up with.