Misc

sleap.util

A miscellaneous set of utility functions. Try not to put things in here unless they really have no other place.

sleap.util.attr_to_dtype(cls: Any)[source]

Converts classes with basic types to numpy composite dtypes.

Parameters

cls – class to convert

Returns

numpy dtype.

sleap.util.dict_cut(d: Dict, a: int, b: int) → Dict[source]

Helper function for creating subdictionary by numeric indexing of items.

Assumes that dict.items() will have a fixed order.

Parameters
  • d – The dictionary to “split”

  • a – Start index of range of items to include in result.

  • b – End index of range of items to include in result.

Returns

A dictionary that contains a subset of the items in the original dict.
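As an illustration of indexing a dictionary by item position, the following is a minimal sketch and not the SLEAP source. The name `dict_cut_sketch` is hypothetical, and it assumes `a` and `b` behave like a half-open slice (item `b` is excluded); the text above does not say whether the end index is inclusive.

```python
from itertools import islice
from typing import Dict


def dict_cut_sketch(d: Dict, a: int, b: int) -> Dict:
    """Keep the items at positions a..b-1 in insertion order.

    Assumption: half-open range, like list slicing.
    """
    return dict(islice(d.items(), a, b))


settings = {"alpha": 1, "beta": 2, "gamma": 3, "delta": 4}
middle = dict_cut_sketch(settings, 1, 3)
print(middle)  # {'beta': 2, 'gamma': 3}
```

This relies on dictionaries preserving insertion order, which Python guarantees from version 3.7 onward.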

sleap.util.find_files_by_suffix(root_dir: str, suffix: str, depth: int = 0) → List[os.DirEntry][source]

Returns list of files matching suffix, optionally searching in subdirs.

Parameters
  • root_dir – Path to directory where we start searching

  • suffix – File suffix to match (e.g., ‘.json’)

  • depth – How many subdirectories deep to keep searching

Returns

List of os.DirEntry objects.
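A depth-limited suffix search of this kind could be sketched with `os.scandir` as below. This is an illustrative stand-in rather than the SLEAP implementation. The name `find_by_suffix_sketch` is hypothetical, and it assumes `depth=0` means "search only `root_dir` itself".

```python
import os
import tempfile
from typing import List


def find_by_suffix_sketch(root_dir: str, suffix: str, depth: int = 0) -> List[os.DirEntry]:
    """Collect files ending in `suffix`, descending at most `depth` levels."""
    matches: List[os.DirEntry] = []
    subdirs: List[str] = []
    with os.scandir(root_dir) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.endswith(suffix):
                matches.append(entry)
            elif entry.is_dir():
                subdirs.append(entry.path)
    if depth > 0:
        for subdir in subdirs:
            matches.extend(find_by_suffix_sketch(subdir, suffix, depth - 1))
    return matches


# Build a small throwaway tree: a.json, b.txt, sub/c.json
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.json", "b.txt", os.path.join("sub", "c.json")):
        open(os.path.join(root, rel), "w").close()
    shallow = sorted(e.name for e in find_by_suffix_sketch(root, ".json"))
    deep = sorted(e.name for e in find_by_suffix_sketch(root, ".json", depth=1))

print(shallow)  # ['a.json']
print(deep)     # ['a.json', 'c.json']
```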

sleap.util.frame_list(frame_str: str) → Optional[List[int]][source]

Converts ‘n-m’ string to list of ints.

Parameters

frame_str – string representing range

Returns

List of ints, or None if string does not represent valid range.
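The conversion can be sketched as follows. This is a hypothetical re-implementation (`frame_list_sketch`), not the SLEAP code. It assumes both endpoints are included in the result and that a reversed range such as '9-2' is treated as invalid; neither detail is stated above.

```python
import re
from typing import List, Optional


def frame_list_sketch(frame_str: str) -> Optional[List[int]]:
    """Parse 'n-m' into [n, ..., m]; returns None for anything else.

    Assumption: the range is inclusive at both ends.
    """
    match = re.fullmatch(r"\s*(\d+)\s*-\s*(\d+)\s*", frame_str or "")
    if match is None:
        return None
    start, end = int(match.group(1)), int(match.group(2))
    if start > end:
        return None
    return list(range(start, end + 1))


print(frame_list_sketch("3-6"))   # [3, 4, 5, 6]
print(frame_list_sketch("oops"))  # None
```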

sleap.util.get_config_file(shortname: str, ignore_file_not_found: bool = False, get_defaults: bool = False) → str[source]

Returns the full path to the specified config file.

The config file will be at ~/.sleap/<shortname>

If that file doesn’t yet exist, we’ll look for a <shortname> file inside the package config directory (sleap/config) and copy the file into the user’s config directory (creating the directory if needed).

Parameters
  • shortname – The short filename, e.g., shortcuts.yaml

  • ignore_file_not_found – If True, then return path for config file regardless of whether it exists.

  • get_defaults – If True, then just return the path to default config file.

Raises

FileNotFoundError – If the specified config file cannot be found.

Returns

The full path to the specified config file.
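The lookup order described above (user directory first, then copy the packaged default) can be sketched as below. This is not the SLEAP implementation: `get_config_file_sketch` is a hypothetical name, and the `user_dir` and `package_dir` parameters are introduced only so the example can run against temporary directories instead of ~/.sleap and the installed package.

```python
import os
import shutil
import tempfile


def get_config_file_sketch(shortname, user_dir, package_dir,
                           ignore_file_not_found=False, get_defaults=False):
    """Resolve a config path, copying the packaged default on first use."""
    default_path = os.path.join(package_dir, shortname)
    if get_defaults:
        return default_path
    user_path = os.path.join(user_dir, shortname)
    if not os.path.exists(user_path):
        if os.path.exists(default_path):
            # First use: create the user config directory and seed it.
            os.makedirs(user_dir, exist_ok=True)
            shutil.copy(default_path, user_path)
        elif not ignore_file_not_found:
            raise FileNotFoundError(f"Cannot locate {shortname} config file.")
    return user_path


with tempfile.TemporaryDirectory() as tmp:
    package_dir = os.path.join(tmp, "package_config")
    user_dir = os.path.join(tmp, "home", ".sleap")
    os.makedirs(package_dir)
    with open(os.path.join(package_dir, "shortcuts.yaml"), "w") as f:
        f.write("save: Ctrl+S\n")
    resolved = get_config_file_sketch("shortcuts.yaml", user_dir, package_dir)
    copied = os.path.exists(resolved)
    in_user_dir = os.path.dirname(resolved) == user_dir
```

After the call, `resolved` points inside the user directory and the file exists there, even though only the packaged default existed beforehand.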

sleap.util.get_package_file(filename: str) → str[source]

Returns full path to specified file within sleap package.

sleap.util.json_dumps(d: Dict, filename: str = None)[source]

A simple wrapper around the JSON encoder we are using.

Parameters
  • d – The dict to write.

  • filename – The filename to write to.

Returns

None

sleap.util.json_loads(json_str: str) → Dict[source]

A simple wrapper around the JSON decoder we are using.

Parameters

json_str – JSON string to decode.

Returns

Result of decoding JSON string.

sleap.util.make_scoped_dictionary(flat_dict: Dict[str, Any], exclude_nones: bool = True) → Dict[str, Dict[str, Any]][source]

Converts dictionary with scoped keys to dictionary of dictionaries.

Parameters
  • flat_dict – The dictionary to convert. Keys should be strings with scope.foo format.

  • exclude_nones – Whether to exclude items where value is None.

Returns

Dictionary in which keys are scope and values are dictionaries with foo (etc.) as keys and the original value of scope.foo as value.
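To make the transformation concrete, here is a minimal sketch. `make_scoped_sketch` is a hypothetical stand-in, not the SLEAP source, and it assumes every key contains a dot and is split at the first one.

```python
from typing import Any, Dict


def make_scoped_sketch(flat_dict: Dict[str, Any],
                       exclude_nones: bool = True) -> Dict[str, Dict[str, Any]]:
    """Group 'scope.foo' keys into {scope: {foo: value}}."""
    scoped: Dict[str, Dict[str, Any]] = {}
    for key, value in flat_dict.items():
        if exclude_nones and value is None:
            continue
        # Assumption: split on the first dot only.
        scope, _, name = key.partition(".")
        scoped.setdefault(scope, {})[name] = value
    return scoped


flat = {"trainer.lr": 0.001, "trainer.epochs": None, "model.depth": 3}
nested = make_scoped_sketch(flat)
print(nested)  # {'trainer': {'lr': 0.001}, 'model': {'depth': 3}}
```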

sleap.util.save_dict_to_hdf5(h5file: h5py._hl.files.File, path: str, dic: dict)[source]

Saves dictionary to an HDF5 file.

Calls itself recursively if items in dictionary are not np.ndarray, np.int64, np.float64, str, or bytes. Objects must be iterable.

Parameters
  • h5file – The HDF5 file object to save the data to. Assumed to be open.

  • path – The path to the group under which to save the dict.

  • dic – The dict to save.

Raises

ValueError – If type for item in dict cannot be saved.

Returns

None

sleap.util.uniquify(seq: Iterable[Hashable]) → List[source]

Returns unique elements from list, preserving order.

Note: This will not work on Python 3.5 or lower since dicts don’t preserve order.

Parameters

seq – The list to remove duplicates from.

Returns

The unique elements from the input list extracted in original order.
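The note about dict ordering suggests an approach along these lines. The snippet is a sketch of that idea, with the hypothetical name `uniquify_sketch`, and is not copied from SLEAP.

```python
from typing import Hashable, Iterable, List


def uniquify_sketch(seq: Iterable[Hashable]) -> List:
    """Drop duplicates while keeping the first occurrence of each element."""
    # dict keys are unique and, in Python 3.7+, keep insertion order.
    return list(dict.fromkeys(seq))


unique = uniquify_sketch([3, 1, 3, 2, 1])
print(unique)  # [3, 1, 2]
```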

sleap.util.usable_cpu_count() → int[source]

Gets number of CPUs usable by the current process.

Takes cpuset restrictions into consideration.

Returns

The number of usable cpus
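One common way to respect CPU affinity restrictions with the standard library is shown below. This is a sketch under the assumption that affinity is the restriction of interest; it is not necessarily how SLEAP computes the value, and `usable_cpu_count_sketch` is a hypothetical name.

```python
import os


def usable_cpu_count_sketch() -> int:
    """Count CPUs this process may run on, falling back to the total count."""
    try:
        # Available on Linux; reflects affinity masks and cpusets.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity (e.g. macOS, Windows).
        return os.cpu_count() or 1


cpu_count = usable_cpu_count_sketch()
print(cpu_count)
```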

sleap.util.weak_filename_match(filename_a: str, filename_b: str) → bool[source]

Check if paths probably point to same file.

Compares the filename and the names of the two directories directly above it.

Parameters
  • filename_a – first path to check

  • filename_b – path to check against first path

Returns

True if the paths probably match.
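The comparison can be pictured as matching the last three components of each path. The sketch below is an assumption-laden illustration, not the SLEAP code: `weak_match_sketch` and its `parts` parameter are hypothetical, and treating both `/` and `\` as separators is a choice made here so paths from different operating systems can be compared.

```python
import re
from typing import List


def weak_match_sketch(filename_a: str, filename_b: str, parts: int = 3) -> bool:
    """Compare the filename plus the two directories above it."""
    def tail(path: str) -> List[str]:
        # Split on either separator and keep the trailing components.
        return [p for p in re.split(r"[\\/]+", path) if p][-parts:]

    return tail(filename_a) == tail(filename_b)


same = weak_match_sketch("/data/lab/exp1/day2/video.mp4",
                         r"D:\backup\exp1\day2\video.mp4")
different = weak_match_sketch("/data/exp1/day2/video.mp4",
                              "/data/exp1/day3/video.mp4")
print(same, different)  # True False
```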

sleap.rangelist

Module with RangeList class for manipulating a list of range intervals.

This is used to cache the track occupancy so that the cache can be kept up to date when the user manipulates tracks for a range of instances.

class sleap.rangelist.RangeList(range_list: List[Tuple[int]] = None)[source]

Class for manipulating a list of range intervals. Each range interval in the list is a [start, end)-tuple.

add(val, tolerance=0)[source]

Add a single value, merges to last range if contiguous.

cut(cut: int)[source]

Return a pair of lists with everything before/after cut.

static cut_(range_list: List[Tuple[int]], cut: int)[source]

Return a pair of lists with everything before/after cut.

Parameters
  • range_list – the list to cut

  • cut – the value at which to cut list

Returns

(pre-cut list, post-cut list)-tuple
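Cutting a list of half-open [start, end) ranges at a value can be sketched as follows. `cut_sketch` is a hypothetical function written for illustration. It assumes a range that straddles the cut is split into two pieces, which the description above implies but does not spell out.

```python
from typing import List, Tuple

Range = Tuple[int, int]


def cut_sketch(range_list: List[Range], cut: int) -> Tuple[List[Range], List[Range]]:
    """Split [start, end) ranges into those before and those after `cut`."""
    pre: List[Range] = []
    post: List[Range] = []
    for start, end in range_list:
        if end <= cut:
            pre.append((start, end))
        elif start >= cut:
            post.append((start, end))
        else:
            # The range straddles the cut, so split it in two.
            pre.append((start, cut))
            post.append((cut, end))
    return pre, post


before, after = cut_sketch([(0, 4), (6, 10)], 8)
print(before)  # [(0, 4), (6, 8)]
print(after)   # [(8, 10)]
```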

cut_range(cut: tuple)[source]

Return three lists, everything before/within/after cut range.

insert(new_range: tuple)[source]

Add a new range, merging to adjacent/overlapping ranges as appropriate.
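Merging on insert can be sketched with a single pass over the existing ranges. This is an illustrative version operating on a plain list, not the RangeList method itself; `insert_range_sketch` is a hypothetical name. Ranges are half-open [start, end), so two ranges that merely touch (one ends where the next starts) are treated as adjacent and merged.

```python
from typing import List, Tuple

Range = Tuple[int, int]


def insert_range_sketch(ranges: List[Range], new_range: Range) -> List[Range]:
    """Insert `new_range`, absorbing any overlapping or adjacent ranges."""
    start, end = new_range
    merged: List[Range] = []
    for lo, hi in sorted(ranges):
        if hi < start or lo > end:
            # Strictly separated from the new range: keep unchanged.
            merged.append((lo, hi))
        else:
            # Overlapping or touching: grow the new range to cover it.
            start, end = min(start, lo), max(end, hi)
    merged.append((start, end))
    return sorted(merged)


result = insert_range_sketch([(0, 3), (5, 8), (12, 15)], (3, 5))
print(result)  # [(0, 8), (12, 15)]
```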

insert_list(range_list: List[Tuple[int]])[source]

Add each range from a list of ranges.

property is_empty

Returns True if the list is empty.

classmethod join_(list_list: List[List[Tuple[int]]])[source]

Return a single list that includes all lists in input list.

Parameters

list_list – a list of range lists

Returns

range list that joins all of the lists in list_list

static join_pair_(list_a: List[Tuple[int]], list_b: List[Tuple[int]])[source]

Return a single pair of lists that joins two input lists.

property list

Returns the list of ranges.

remove(remove: tuple)[source]

Remove everything that overlaps with given range.

property start

Return the start value of range (or None if empty).

sleap.io.legacy

Module for legacy LEAP dataset.

sleap.io.legacy.load_labels_json_old(data_path: str, parsed_json: dict = None, adjust_matlab_indexing: bool = True, fix_rel_paths: bool = True) → List[sleap.instance.LabeledFrame][source]

Load predicted instances from Talmo’s old JSON format.

Parameters
  • data_path – The path to the JSON file.

  • parsed_json – The parsed json if already loaded, so we can save some time if already parsed.

  • adjust_matlab_indexing – Whether to adjust indexing from MATLAB.

  • fix_rel_paths – Whether to fix paths to videos to absolute paths.

Returns

A newly constructed Labels object.

sleap.io.legacy.load_predicted_labels_json_old(data_path: str, parsed_json: dict = None, adjust_matlab_indexing: bool = True, fix_rel_paths: bool = True) → List[sleap.instance.LabeledFrame][source]

Load predicted instances from Talmo’s old JSON format.

Parameters
  • data_path – The path to the JSON file.

  • parsed_json – The parsed json if already loaded, so we can save some time if already parsed.

  • adjust_matlab_indexing – Whether to adjust indexing from MATLAB.

  • fix_rel_paths – Whether to fix paths to videos to absolute paths.

Returns

List of LabeledFrame objects.

sleap.info.metrics

Module for producing prediction metrics for SLEAP datasets.

sleap.info.metrics.calculate_pairwise_cost(instances_a: List[sleap.instance.Instance], instances_b: List[sleap.instance.Instance], cost_function: Callable) → numpy.ndarray[source]

Calculate (a * b) matrix of pairwise costs using cost function.

sleap.info.metrics.compare_instance_lists(instances_a: List[sleap.instance.Instance], instances_b: List[sleap.instance.Instance]) → numpy.ndarray[source]

Given two lists of corresponding Instances, returns (instances * nodes) matrix of distances between corresponding nodes.

sleap.info.metrics.list_points_array(instances: List[sleap.instance.Instance]) → numpy.ndarray[source]

Given list of Instances, returns (instances * nodes * 2) matrix.

sleap.info.metrics.match_instance_lists(instances_a: List[sleap.instance.Instance], instances_b: List[sleap.instance.Instance], cost_function: Callable) → Tuple[List[sleap.instance.Instance], List[sleap.instance.Instance]][source]

Sorts two lists of Instances to find best overall correspondence for a given cost function (e.g., total distance between points).

sleap.info.metrics.match_instance_lists_nodewise(instances_a: List[sleap.instance.Instance], instances_b: List[sleap.instance.Instance], thresh: float = 5) → Tuple[List[sleap.instance.Instance], List[sleap.instance.Instance]][source]

For each node for each instance in the first list, pairs it with the closest corresponding node from any instance in the second list.

sleap.info.metrics.matched_instance_distances(labels_gt: sleap.io.dataset.Labels, labels_pr: sleap.io.dataset.Labels, match_lists_function: Callable, frame_range: Optional[range] = None) → Tuple[List[int], numpy.ndarray, numpy.ndarray, numpy.ndarray][source]

Distances between ground truth and predicted nodes over a set of frames.

Parameters
  • labels_gt – the Labels object with ground truth data

  • labels_pr – the Labels object with predicted data

  • match_lists_function – function for determining corresponding instances. Takes two lists of instances and returns “sorted” lists.

  • frame_range (optional) – range of frames for which to compare data. If None, we compare every frame in labels_gt with the corresponding frame in labels_pr.

Returns

  • frame indices map: instance idx (for other matrices) -> frame idx

  • distance matrix: (instances * nodes)

  • ground truth points matrix: (instances * nodes * 2)

  • predicted points matrix: (instances * nodes * 2)

Return type

Tuple

sleap.info.metrics.nodeless_point_dist(inst_a: sleap.instance.Instance, inst_b: sleap.instance.Instance) → numpy.ndarray[source]

Given two instances, returns array of distances for closest points ignoring node identities.

sleap.info.metrics.point_dist(inst_a: sleap.instance.Instance, inst_b: sleap.instance.Instance) → numpy.ndarray[source]

Given two instances, returns array of distances for corresponding nodes.

sleap.info.metrics.point_match_count(dist_array: numpy.ndarray, thresh: float = 5) → int[source]

Given an array of distances, returns number which are <= threshold.

sleap.info.metrics.point_nonmatch_count(dist_array: numpy.ndarray, thresh: float = 5) → int[source]

Given an array of distances, returns number which are not <= threshold.
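The relationship between the distance and counting helpers can be illustrated without SLEAP objects. The sketch below uses plain lists of (x, y) tuples in place of Instance objects and NumPy arrays; all three function names are hypothetical. It assumes a missing point is represented by NaN coordinates and, because the non-match count is defined as "not <= threshold", that such a point counts as a non-match.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_dist_sketch(points_a: Sequence[Point], points_b: Sequence[Point]) -> List[float]:
    """Euclidean distance between corresponding nodes."""
    return [math.dist(a, b) for a, b in zip(points_a, points_b)]


def match_count_sketch(dists: Sequence[float], thresh: float = 5) -> int:
    """Number of distances that are <= thresh."""
    return sum(1 for d in dists if d <= thresh)


def nonmatch_count_sketch(dists: Sequence[float], thresh: float = 5) -> int:
    """Number of distances that are not <= thresh (NaN lands here)."""
    return sum(1 for d in dists if not d <= thresh)


nan = float("nan")
ground_truth = [(0.0, 0.0), (10.0, 10.0), (20.0, 20.0)]
predicted = [(3.0, 4.0), (10.0, 18.0), (nan, nan)]

dists = point_dist_sketch(ground_truth, predicted)
matches = match_count_sketch(dists)
nonmatches = nonmatch_count_sketch(dists)
print(matches, nonmatches)  # 1 2
```

With this definition the two counts always sum to the number of nodes, including nodes whose distance is undefined.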

sleap.info.summary

Module for getting a series which gives some statistic based on labeling data for each frame of some labeled video.

class sleap.info.summary.StatisticSeries(labels: sleap.io.dataset.Labels)[source]

Class to calculate various statistical series for labeled frames.

Each method returns a series, which is a dictionary in which each key is a frame index and each value is some numerical value for that frame.

Parameters

labels – The Labels for which to calculate series.

get_instance_score_series(video, reduction='sum') → Dict[int, float][source]

Get series with statistic of instance scores in each frame.

Parameters
  • video – The Video for which to calculate statistic.

  • reduction – name of function applied to scores, one of: sum, min

Returns

The series dictionary (see class docs for details)

get_point_count_series(video: sleap.io.video.Video) → Dict[int, float][source]

Get series with total number of labeled points in each frame.

get_point_displacement_series(video, reduction='sum') → Dict[int, float][source]

Get series with statistic of point displacement in each frame.

Point displacement is the distance between the point location in frame and the location of the corresponding point (same node, same track) from the closest earlier labeled frame.

Parameters
  • video – The Video for which to calculate statistic.

  • reduction – name of function applied to point scores, one of: sum, mean, max

Returns

The series dictionary (see class docs for details)
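The displacement definition above can be sketched with plain dictionaries. This is an illustration only and not the StatisticSeries implementation: `displacement_series_sketch` is a hypothetical name, the input format (frame index mapped to a dictionary of (track, node) keys and (x, y) values) is invented for the example, and frames with no point in common with the previous labeled frame are assumed to be left out of the series.

```python
import math
from typing import Dict, Tuple

Key = Tuple[str, str]            # (track name, node name)
Point = Tuple[float, float]


def displacement_series_sketch(frames: Dict[int, Dict[Key, Point]],
                               reduction: str = "sum") -> Dict[int, float]:
    """Per-frame reduction of point displacements from the prior labeled frame."""
    reducers = {
        "sum": sum,
        "max": max,
        "mean": lambda values: sum(values) / len(values),
    }
    reduce_fn = reducers[reduction]
    series: Dict[int, float] = {}
    prev_points = None
    for frame_idx in sorted(frames):
        points = frames[frame_idx]
        if prev_points is not None:
            # Only points with the same track and node are compared.
            dists = [math.dist(xy, prev_points[key])
                     for key, xy in points.items() if key in prev_points]
            if dists:
                series[frame_idx] = reduce_fn(dists)
        prev_points = points
    return series


labeled = {
    0: {("t0", "head"): (0.0, 0.0), ("t0", "tail"): (0.0, 10.0)},
    3: {("t0", "head"): (3.0, 4.0), ("t0", "tail"): (0.0, 10.0)},
    4: {("t0", "head"): (3.0, 4.0)},
}
summed = displacement_series_sketch(labeled, "sum")
peak = displacement_series_sketch(labeled, "max")
print(summed)  # {3: 5.0, 4: 0.0}
```

Frame 0 has no earlier labeled frame, so it does not appear in the result. Frame 3 is compared against frame 0 even though frames 1 and 2 are unlabeled.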

get_point_score_series(video: sleap.io.video.Video, reduction: str = 'sum') → Dict[int, float][source]

Get series with statistic of point scores in each frame.

Parameters
  • video – The Video for which to calculate statistic.

  • reduction – name of function applied to scores, one of: sum, min

Returns

The series dictionary (see class docs for details)

get_primary_point_displacement_series(video, reduction='sum', primary_node=None)[source]

Get sum of displacement for single node of each instance per frame.

Parameters
  • video – The Video for which to calculate statistic.

  • reduction – name of function applied to point scores, one of: sum, mean, max

  • primary_node – The node for which we’ll calculate displacement. This can be name of node or Node object. If not specified, then defaults to first node.

Returns

The series dictionary (see class docs for details)

sleap.info.write_tracking_h5

Generate an HDF5 file with track occupancy and point location data.

Ignores tracks that are entirely empty. By default, empty frames at the beginning and end of the video are also ignored, although the --all-frames argument will make it include empty frames from the beginning of the video.

Call from command line as:

>>> python -m sleap.info.write_tracking_h5 <labels_filename>

Will write file to <labels_filename>.tracking.h5.

The HDF5 file has these datasets:

  • “track_occupancy” shape: tracks * frames

  • “tracks” shape: frames * nodes * 2 * tracks

  • “track_names” shape: tracks

  • “node_names” shape: nodes

Note: the datasets are stored column-major as expected by MATLAB.

sleap.info.write_tracking_h5.get_nodes_as_np_strings(labels: sleap.io.dataset.Labels) → List[numpy.bytes_][source]

Get list of node names as np.string_.

sleap.info.write_tracking_h5.get_occupancy_and_points_matrices(labels: sleap.io.dataset.Labels, all_frames: bool) → Tuple[numpy.ndarray, numpy.ndarray][source]

Builds numpy matrices with track occupancy and point location data.

Parameters
  • labels – The Labels from which to get data.

  • all_frames – If True, then includes zeros so that frame index will line up with columns in the output. Otherwise, there will only be columns for the frames between the first and last frames with labeling data.

Returns

  • occupancy matrix with shape (tracks, frames)

  • point location matrix with shape (frames, nodes, 2, tracks)

Return type

tuple of two matrices

sleap.info.write_tracking_h5.get_tracks_as_np_strings(labels: sleap.io.dataset.Labels) → List[numpy.bytes_][source]

Get list of track names as np.string_.

sleap.info.write_tracking_h5.main(labels: sleap.io.dataset.Labels, output_path: str, all_frames: bool = True)[source]

Writes HDF5 file with matrices of track occupancy and coordinates.

Parameters
  • labels – The Labels from which to get data.

  • output_path – Path of HDF5 file to create.

  • all_frames – If True, then includes zeros so that frame index will line up with columns in the output. Otherwise, there will only be columns for the frames between the first and last frames with labeling data.

Returns

None

sleap.info.write_tracking_h5.remove_empty_tracks_from_matrices(track_names: List, occupancy_matrix: numpy.ndarray, locations_matrix: numpy.ndarray) → Tuple[List, numpy.ndarray, numpy.ndarray][source]

Removes matrix rows/columns for unoccupied tracks.

Parameters
  • track_names – List of track names

  • occupancy_matrix – 2d numpy matrix, rows correspond to tracks

  • locations_matrix – 4d numpy matrix, last index is track

Returns

track_names, occupancy_matrix, locations_matrix from input, but without the rows/columns corresponding to unoccupied tracks.

sleap.info.write_tracking_h5.write_occupancy_file(output_path: str, data_dict: Dict[str, Any], transpose: bool = True)[source]

Write HDF5 file with data from given dictionary.

Parameters
  • output_path – Path of HDF5 file.

  • data_dict – Dictionary with data to save. Keys are dataset names, values are the data.

  • transpose – If True, then any ndarray in data dictionary will be transposed before saving. This is useful for writing files that will be imported into MATLAB, which expects data in column-major format.

Returns

None