API reference

fastabx.zerospeech_abx(item, root, *, max_size_group, max_x_across, speaker='within', context='within', distance='angular', frequency=50, feature_maker=<function load>, extension='.pt', seed=0)[source]

Compute the ABX score, similarly to the ZeroSpeech 2021 challenge.

The evaluation runs on triphones or phonemes described by an item file, either within or across speakers, and either within context or ignoring it.

Parameters:
  • item (str | Path) – Path to the item file.

  • root (str | Path) – Path to the root directory containing either the features or the audio files.

  • max_size_group (int | None) – Maximum number of instances of A, B, or X in each Cell. Passed to the Subsampler of the Task. Set to 10 in the original ZeroSpeech ABX code. Disabled if set to None.

  • max_x_across (int | None) – In the “across” speaker mode, maximum number of X considered for given values of A and B. Passed to the Subsampler of the Task. Set to 5 in the original ZeroSpeech ABX code. Disabled if set to None.

  • speaker (Literal['within', 'across']) – The speaker mode, either “within” or “across”. Defaults to “within”.

  • context (Literal['within', 'any']) – The context mode, either “within” or “any”. Always use “within” with representations of triphones. Defaults to “within”.

  • distance (DistanceName) – The distance metric, “angular” (same as “cosine”), “euclidean”, “kl_symmetric” or “identical”. Defaults to “angular”.

  • frequency (int) – The feature frequency of the features / the output of the feature maker, in Hz. Defaults to 50 Hz.

  • feature_maker (Callable[[str | Path], Tensor]) – Function that takes a path and returns a torch.Tensor. Defaults to torch.load.

  • extension (str) – The filename extension of the files to process in root, default is “.pt”.

  • seed (int) – The random seed for the subsampling, default is 0.

Return type:

float

Standard classes and functions

Dataset

class fastabx.Dataset(labels, accessor)[source]

Simple interface to a dataset.

Parameters:
  • labels (DataFrame) – pl.DataFrame containing the labels of the datapoints.

  • accessor (InMemoryAccessor) – InMemoryAccessor to access the data.

classmethod from_csv(path, feature_columns, *, separator=',')[source]

Create a dataset from a CSV file.

Parameters:
  • path (str | Path) – Path to the CSV file containing both the labels and the features.

  • feature_columns (str | Collection[str]) – Column name or list of column names containing the features.

  • separator (str) – Separator used in the CSV file.

Return type:

Dataset

classmethod from_dataframe(df, feature_columns)[source]

Create a dataset from a DataFrame (polars or pandas).

Parameters:
  • df (SupportsInterchange) – DataFrame containing both the labels and the features.

  • feature_columns (str | Collection[str]) – Column name or list of column names containing the features.

Return type:

Dataset

classmethod from_item(item, root, frequency, *, feature_maker=<function load>, extension='.pt', file_col='#file', onset_col='onset', offset_col='offset')[source]

Create a dataset from an item file.

If you want to keep the Libri-Light bug to reproduce previous results, set the environment variable FASTABX_WITH_LIBRILIGHT_BUG=1.

Parameters:
  • item (str | Path) – Path to the item file.

  • root (str | Path) – Path to the root directory containing either the features or the audio files.

  • frequency (int) – The feature frequency of the features / the output of the feature maker, in Hz.

  • feature_maker (Callable[[str | Path], Tensor]) – Function that takes a path and returns a torch.Tensor. Defaults to torch.load.

  • extension (str) – The filename extension of the files to process in root, default is “.pt”.

  • file_col (str) – Column in the item file that contains the audio file names, default is “#file”.

  • onset_col (str) – Column in the item file that contains the onset times, default is “onset”.

  • offset_col (str) – Column in the item file that contains the offset times, default is “offset”.

Return type:

Dataset

classmethod from_item_and_units(item, units, frequency, *, audio_key='audio', units_key='units', separator=' ', file_col='#file', onset_col='onset', offset_col='offset')[source]

Create a dataset from an item file with the units all described in a single JSONL file.

Parameters:
  • item (str | Path) – Path to the item file.

  • units (str | Path) – Path to the JSONL file containing the units.

  • frequency (int) – The feature frequency, in Hz.

  • audio_key (str) – Key in the JSONL file that contains the audio file names, default is “audio”.

  • units_key (str) – Key in the JSONL file that contains the units, default is “units”.

  • separator (str) – Separator used in the units field, default is a single space “ ”.

  • file_col (str) – Column in the item file that contains the audio file names, default is “#file”.

  • onset_col (str) – Column in the item file that contains the onset times, default is “onset”.

  • offset_col (str) – Column in the item file that contains the offset times, default is “offset”.

Return type:

Dataset

classmethod from_item_with_times(item, features, times, *, file_col='#file', onset_col='onset', offset_col='offset')[source]

Create a dataset from an item file.

Use arrays containing the times associated with the features instead of a fixed frequency.

Parameters:
  • item (str | Path) – Path to the item file.

  • features (str | Path) – Path to the root directory containing the features.

  • times (str | Path) – Path to the root directory containing the times arrays.

  • file_col (str) – Column in the item file that contains the audio file names, default is “#file”.

  • onset_col (str) – Column in the item file that contains the onset times, default is “onset”.

  • offset_col (str) – Column in the item file that contains the offset times, default is “offset”.

Return type:

Dataset

classmethod from_numpy(features, labels)[source]

Create a dataset from the features (numpy array) and the labels (dictionary of sequences).

Parameters:
  • features (ArrayLike) – 2D array-like containing the features.

  • labels (Mapping[str, Sequence[object]] | SupportsInterchange) – Dictionary of sequences or DataFrame containing the labels.

Return type:

Dataset

Task

class fastabx.Task(dataset, *, on, by=None, across=None, subsampler=None)[source]

The ABX task class.

A Task builds all the Cells given the on, by, and across conditions. It can be subsampled to limit the number of cells.

Parameters:
  • dataset (Dataset) – The dataset containing the features and the labels.

  • on (str) – The on condition.

  • by (list[str] | None) – The list of by conditions.

  • across (list[str] | None) – The list of across conditions.

  • subsampler (Subsampler | None) – An optional subsampler to limit the number of cells and their sizes.

Subsample

class fastabx.Subsampler(max_size_group, max_x_across, seed=0)[source]

Subsample the ABX Task.

Each cell is limited to max_size_group items for A, B and X independently. When using “across” conditions, each group of (A, B) is limited to max_x_across possible values for X. Subsampling for one or more conditions can be disabled by setting the corresponding argument to None.

Parameters:
  • max_size_group (int | None) – Maximum number of instances of A, B, or X in each Cell. Set to 10 in the original ZeroSpeech ABX code. Disabled if set to None.

  • max_x_across (int | None) – In the “across” speaker mode, maximum number of X considered for given values of A and B. Set to 5 in the original ZeroSpeech ABX code. Disabled if set to None.

  • seed (int) – The random seed for the subsampling, default is 0.

Score

class fastabx.Score(task, distance_name, *, constraints=None)[source]

Compute the score of a Task using the distance specified by distance_name.

Additional Constraints can be provided to restrict the possible triplets in each cell.

Parameters:
  • task (Task) – The Task to score.

  • distance_name (DistanceName) – Name of the distance, “angular” (same as “cosine”), “euclidean”, “kl”, “kl_symmetric” or “identical”.

  • constraints (Constraints | None) – Optional constraints to restrict the possible triplets.

collapse(*, levels=None, weighted=False)[source]

Collapse the scored cells into the final score.

Use either levels or weighted=True to collapse the scores.

Parameters:
  • levels (Sequence[tuple[str, ...] | str] | None) – List of levels to collapse. The order matters a lot.

  • weighted (bool) – Whether to collapse the scores using a mean weighted by the size of the cells.

Return type:

float

details(*, levels)[source]

Collapse the scored cells and return the final scores and sizes for each (A, B) pair.

Parameters:

levels (Sequence[tuple[str, ...] | str] | None) – List of levels to collapse. The order matters a lot.

Return type:

DataFrame

write_csv(file)[source]

Write the results of all the cells to a CSV file.

Parameters:

file (str | Path) – Path to the output CSV file.

Return type:

None

Pooling

fastabx.pooling(dataset, pooling_name)[source]

Pool the Dataset using the pooling method given by pooling_name.

The pooled dataset is a new one, with data stored in memory. For simplicity, we iterate through the original dataset and apply pooling on each element.

Parameters:
  • dataset (Dataset) – The dataset to pool.

  • pooling_name (PoolingName) – The pooling method, either “mean” or “hamming”.

Return type:

PooledDataset

Advanced

Cell

class fastabx.cell.Cell(a, b, x, header, description, is_symmetric)[source]

Individual cell of the ABX task.

Cells are the unit of work for the ABX Task and Score. They are collections of triplets (A, B, X) that share the same values for the on, by and across conditions.

Parameters:
  • a (Batch) – Batch of A samples.

  • b (Batch) – Batch of B samples.

  • x (Batch) – Batch of X samples.

  • header (str) – Short string identifying the cell.

  • description (str) – Long string describing the cell.

  • is_symmetric (bool) – Whether or not the cell is symmetric (i.e., A and X are the same set).

property num_triplets: int[source]

Get the number of triplets in the cell.

property use_dtw: bool[source]

Whether or not to use the DTW when computing the distances for this cell.

We don’t need DTW if all samples in the cell have a time dimension of 1.

Distance

fastabx.distance.distance_on_cell(cell, distance)[source]

Compute the distance matrices between all A and X, and all B and X in the cell, for a given distance.

Parameters:
  • cell (Cell) – The cell to compute the distances on.

  • distance (Distance) – The distance function to use. It takes two tensors of shape (n1, s1, d) and (n2, s2, d) and returns a tensor of shape (n1, n2, s1, s2).

Return type:

tuple[Tensor, Tensor]

fastabx.distance.abx_on_cell(cell, distance, *, mask=None)[source]

Compute the ABX of a cell using the given distance.

Parameters:
  • cell (Cell) – The cell to compute the ABX on.

  • distance (Distance) – The distance function to use. It takes two tensors of shape (n1, s1, d) and (n2, s2, d) and returns a tensor of shape (n1, n2, s1, s2).

  • mask (Tensor | None) – Optional boolean mask of shape (nx, na, nb) to select which triplets to include in the score.

Return type:

Tensor
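
The decision rule behind the score is the standard ABX rule: a triplet (A, B, X) is correct when X is closer to A than to B, and ties count as half an error. A pure-Python sketch of this rule, ignoring the extra bookkeeping for symmetric cells (where triplets with A = X are excluded):

```python
def abx_error_rate(d_ax, d_bx):
    """ABX error rate from two distance matrices.

    d_ax[i][k] is the distance between A_i and X_k, and d_bx[j][k] the
    distance between B_j and X_k. A triplet (i, j, k) is correct when
    d(A_i, X_k) < d(B_j, X_k); ties count as half an error.
    """
    errors, total = 0.0, 0
    for k in range(len(d_ax[0])):
        for i in range(len(d_ax)):
            for j in range(len(d_bx)):
                total += 1
                if d_ax[i][k] > d_bx[j][k]:
                    errors += 1
                elif d_ax[i][k] == d_bx[j][k]:
                    errors += 0.5
    return errors / total

# X_0 is closer to A_0 than to B_0: that triplet is correct.
print(abx_error_rate([[0.1]], [[0.9]]))  # 0.0
```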

DTW

fastabx.dtw.dtw(distances)[source]

Compute the DTW of the given distances 2D tensor.

Parameters:

distances (Tensor) – A 2D tensor of shape (n, m) representing the pairwise distances between two sequences.

Return type:

Tensor
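
For reference, the underlying recursion is the classic dynamic programming one. A minimal pure-Python sketch of the accumulated cost (the library operates on torch tensors in compiled code and may additionally normalize by path length):

```python
def dtw_cost(distances):
    """Accumulated DTW cost of a 2D list of pairwise distances."""
    n, m = len(distances), len(distances[0])
    cost = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # Best cost among the three predecessors (up, left, diagonal).
            best = 0.0
            if i > 0 and j > 0:
                best = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            elif i > 0:
                best = cost[i - 1][j]
            elif j > 0:
                best = cost[i][j - 1]
            cost[i][j] = distances[i][j] + best
    return cost[n - 1][m - 1]

print(dtw_cost([[0.0, 1.0], [1.0, 0.0]]))  # 0.0: the diagonal path
```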

fastabx.dtw.dtw_batch(distances, sx, sy, *, symmetric)[source]

Compute the batched DTW on the distances 4D tensor.

Parameters:
  • distances (Tensor) – A 4D tensor of shape (n1, n2, s1, s2) representing the pairwise distances between two batches of sequences.

  • sx (Tensor) – A 1D tensor of shape (n1,) representing the lengths of the sequences in the first batch.

  • sy (Tensor) – A 1D tensor of shape (n2,) representing the lengths of the sequences in the second batch.

  • symmetric (bool) – Whether or not the DTW is symmetric (i.e., the two batches are the same).

Return type:

Tensor

Constraints

type fastabx.constraints.Constraints

Type alias for Iterable[pl.Expr | pl.Series | str].

Should be a valid input to pl.DataFrame.filter. See With constraints to understand how to use them.

fastabx.constraints.constraints_all_different(*columns)[source]

Return Constraints that ensure that each specified column has different values for A, B and X.

Parameters:

columns (str) – The columns to apply the constraints on.

Return type:

Constraints

Environment variables

  • FASTABX_WITH_LIBRILIGHT_BUG: If set to 1, changes the behaviour of Dataset.from_item to match Libri-Light. Every feature will now be one frame shorter. This should be set only if you want to replicate previous results obtained with Libri-Light / ZeroSpeech 2021. See Slicing features for more details on how features are sliced.