API reference¶
- fastabx.zerospeech_abx(item, root, *, max_size_group, max_x_across, speaker='within', context='within', distance='angular', frequency=50, feature_maker=<function load>, extension='.pt', seed=0)[source]¶
Compute the ABX score, similarly to the ZeroSpeech 2021 challenge.
On triphones or phonemes, described by an item file. Within or across speakers, and within context or ignoring context.
- Parameters:
item (str | Path) – Path to the item file.
root (str | Path) – Path to the root directory containing either the features or the audio files.
max_size_group (int | None) – Maximum number of instances of A, B, or X in each Cell. Passed to the Subsampler of the Task. Set to 10 in the original ZeroSpeech ABX code. Disabled if set to None.
max_x_across (int | None) – In the “across” speaker mode, maximum number of X considered for given values of A and B. Passed to the Subsampler of the Task. Set to 5 in the original ZeroSpeech ABX code. Disabled if set to None.
speaker (Literal['within', 'across']) – The speaker mode, either “within” or “across”. Defaults to “within”.
context (Literal['within', 'any']) – The context mode, either “within” or “any”. Always use “within” with representations of triphones. Defaults to “within”.
distance (DistanceName) – The distance metric, “angular” (same as “cosine”), “euclidean”, “kl_symmetric” or “identical”. Defaults to “angular”.
frequency (int) – The frequency of the features (the output of the feature maker), in Hz. Defaults to 50 Hz.
feature_maker (Callable[[str | Path], Tensor]) – Function that takes a path and returns a torch.Tensor. Defaults to torch.load.
extension (str) – The filename extension of the files to process in root, default is “.pt”.
seed (int) – The random seed for the subsampling, default is 0.
- Return type:
float
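The returned float is the ABX error rate averaged over cells. As an illustration only (this is not the fastabx implementation), the per-cell error can be sketched in pure Python: for each triplet (A, B, X), with X drawn from the same category as A, an error is counted when X is closer to B than to A, and half an error on ties.

```python
# Illustrative sketch, NOT the fastabx implementation: ABX error rate of
# one cell given a scalar distance function. (In the real task, X ranges
# over tokens of A's category distinct from A, and the distance between
# two sequences of frames is itself computed with DTW.)
from itertools import product


def abx_error_rate(a_items, b_items, x_items, dist):
    """Mean error over all (a, b, x) triplets in one cell."""
    errors, total = 0.0, 0
    for a, b, x in product(a_items, b_items, x_items):
        d_ax, d_bx = dist(a, x), dist(b, x)
        if d_bx < d_ax:      # x wrongly closer to the other category
            errors += 1.0
        elif d_bx == d_ax:   # tie counts as half an error
            errors += 0.5
        total += 1
    return errors / total


# Toy 1D example: category A around 0.0, category B around 1.0.
dist = lambda u, v: abs(u - v)
print(abx_error_rate([0.0, 0.1], [1.0, 0.9], [0.05, 0.2], dist))  # 0.0
```

Lower is better: 0.0 means every X was correctly closer to A, 0.5 is chance level.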
Standard classes and functions¶
Dataset¶
- class fastabx.Dataset(labels, accessor)[source]¶
Simple interface to a dataset.
- Parameters:
labels (DataFrame) – pl.DataFrame containing the labels of the datapoints.
accessor (InMemoryAccessor) – InMemoryAccessor to access the data.
- classmethod from_csv(path, feature_columns, *, separator=',')[source]¶
Create a dataset from a CSV file.
- Parameters:
path (str | Path) – Path to the CSV file containing both the labels and the features.
feature_columns (str | Collection[str]) – Column name or list of column names containing the features.
separator (str) – Separator used in the CSV file.
- Return type:
Dataset
- classmethod from_dataframe(df, feature_columns)[source]¶
Create a dataset from a DataFrame (polars or pandas).
- Parameters:
df (SupportsInterchange) – DataFrame containing both the labels and the features.
feature_columns (str | Collection[str]) – Column name or list of column names containing the features.
- Return type:
Dataset
- classmethod from_item(item, root, frequency, *, feature_maker=<function load>, extension='.pt', file_col='#file', onset_col='onset', offset_col='offset')[source]¶
Create a dataset from an item file.
If you want to keep the Libri-Light bug to reproduce previous results, set the environment variable FASTABX_WITH_LIBRILIGHT_BUG=1.
- Parameters:
item (str | Path) – Path to the item file.
root (str | Path) – Path to the root directory containing either the features or the audio files.
frequency (int) – The feature frequency of the features / the output of the feature maker, in Hz.
feature_maker (Callable[[str | Path], Tensor]) – Function that takes a path and returns a torch.Tensor. Defaults to torch.load.
extension (str) – The filename extension of the files to process in root, default is “.pt”.
file_col (str) – Column in the item file that contains the audio file names, default is “#file”.
onset_col (str) – Column in the item file that contains the onset times, default is “onset”.
offset_col (str) – Column in the item file that contains the offset times, default is “offset”.
- Return type:
Dataset
- classmethod from_item_and_units(item, units, frequency, *, audio_key='audio', units_key='units', separator=' ', file_col='#file', onset_col='onset', offset_col='offset')[source]¶
Create a dataset from an item file with the units all described in a single JSONL file.
- Parameters:
item (str | Path) – Path to the item file.
units (str | Path) – Path to the JSONL file containing the units.
frequency (int) – The feature frequency, in Hz.
audio_key (str) – Key in the JSONL file that contains the audio file names, default is “audio”.
units_key (str) – Key in the JSONL file that contains the units, default is “units”.
separator (str) – Separator used in the units field, default is a single whitespace “ ”.
file_col (str) – Column in the item file that contains the audio file names, default is “#file”.
onset_col (str) – Column in the item file that contains the onset times, default is “onset”.
offset_col (str) – Column in the item file that contains the offset times, default is “offset”.
- Return type:
Dataset
- classmethod from_item_with_times(item, features, times, *, file_col='#file', onset_col='onset', offset_col='offset')[source]¶
Create a dataset from an item file.
Use arrays containing the times associated with the features instead of a fixed frequency.
- Parameters:
item (str | Path) – Path to the item file.
features (str | Path) – Path to the root directory containing the features.
times (str | Path) – Path to the root directory containing the times arrays.
file_col (str) – Column in the item file that contains the audio file names, default is “#file”.
onset_col (str) – Column in the item file that contains the onset times, default is “onset”.
offset_col (str) – Column in the item file that contains the offset times, default is “offset”.
- Return type:
Dataset
- classmethod from_numpy(features, labels)[source]¶
Create a dataset from the features (numpy array) and the labels (dictionary of sequences).
- Parameters:
features (ArrayLike) – 2D array-like containing the features.
labels (Mapping[str, Sequence[object]] | SupportsInterchange) – Dictionary of sequences or DataFrame containing the labels.
- Return type:
Dataset
Task¶
- class fastabx.Task(dataset, *, on, by=None, across=None, subsampler=None)[source]¶
The ABX task class.
A Task builds all the Cells given the on, by and across conditions. It can be subsampled to limit the number of cells.
- Parameters:
dataset (Dataset) – The dataset containing the features and the labels.
on (str) – The on condition.
by (list[str] | None) – The list of by conditions.
across (list[str] | None) – The list of across conditions.
subsampler (Subsampler | None) – An optional subsampler to limit the number of cells and their sizes.
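To make the cell construction concrete, here is a hypothetical pure-Python sketch (not the fastabx internals; the across machinery is omitted for brevity, and all names are illustrative): items are grouped by their by values, and within each group every ordered pair of distinct on values yields one cell, where A and X share the on value and B takes the other.

```python
# Hypothetical sketch of how "on" and "by" conditions carve a dataset
# into cells. Not the fastabx internals; "across" handling is omitted.
from collections import defaultdict
from itertools import permutations


def build_cells(items, on, by):
    """items: list of label dicts. Returns {(by_vals, on_a, on_b): cell}."""
    groups = defaultdict(lambda: defaultdict(list))
    for it in items:
        by_key = tuple(it[c] for c in by)
        groups[by_key][it[on]].append(it)
    cells = {}
    for by_key, by_group in groups.items():
        # A and X share the "on" value; B takes a different one.
        for on_a, on_b in permutations(by_group, 2):
            cells[(by_key, on_a, on_b)] = {
                "a": by_group[on_a], "b": by_group[on_b], "x": by_group[on_a],
            }
    return cells


items = [
    {"phone": "a", "speaker": "s1"}, {"phone": "a", "speaker": "s1"},
    {"phone": "e", "speaker": "s1"}, {"phone": "a", "speaker": "s2"},
]
cells = build_cells(items, on="phone", by=["speaker"])
print(sorted(cells))  # two cells for s1 ("a" vs "e", "e" vs "a"); none for s2
```

Note that speaker s2 yields no cell: with a single phone there is no contrasting B, which is why subsampling and data coverage both matter for the number of usable cells.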
Subsample¶
- class fastabx.Subsampler(max_size_group, max_x_across, seed=0)[source]¶
Subsample the ABX Task.
Each cell is limited to max_size_group items for A, B and X independently. When using “across” conditions, each group of (A, B) is limited to max_x_across possible values for X. Subsampling for one or more conditions can be disabled by setting the corresponding argument to None.
- Parameters:
max_size_group (int | None) – Maximum number of instances of A, B, or X in each Cell. Set to 10 in the original ZeroSpeech ABX code. Disabled if set to None.
max_x_across (int | None) – In the “across” speaker mode, maximum number of X considered for given values of A and B. Set to 5 in the original ZeroSpeech ABX code. Disabled if set to None.
seed (int) – The random seed for the subsampling, default is 0.
Score¶
- class fastabx.Score(task, distance_name, *, constraints=None)[source]¶
Compute the score of a Task using the distance specified by distance_name.
Additional Constraints can be provided to restrict the possible triplets in each cell.
- Parameters:
task (Task) – The task to score.
distance_name (DistanceName) – The name of the distance metric to use.
constraints (Constraints | None) – Optional constraints restricting the triplets in each cell.
- collapse(*, levels=None, weighted=False)[source]¶
Collapse the scored cells into the final score.
Use either levels or weighted=True to collapse the scores.
- Parameters:
levels (Sequence[tuple[str, ...] | str] | None) – List of levels to collapse. The order matters a lot.
weighted (bool) – Whether to collapse the scores using a mean weighted by the size of the cells.
- Return type:
float
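As an illustration of the weighted mode (assuming, as the description suggests, a mean of per-cell scores weighted by cell size), a minimal sketch:

```python
# Sketch of weighted collapse (assumption: "weighted" means a mean of
# per-cell scores weighted by the number of triplets in each cell).
def collapse_weighted(cell_scores, cell_sizes):
    total = sum(cell_sizes)
    return sum(s * n for s, n in zip(cell_scores, cell_sizes)) / total


print(collapse_weighted([0.0, 0.5], [3, 1]))  # 0.125
```

Collapsing by levels instead averages scores within each level before averaging across levels, which is why the order of the levels changes the result.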
Pooling¶
- fastabx.pooling(dataset, pooling_name)[source]¶
Pool the Dataset using the pooling method given by pooling_name.
The pooled dataset is a new one, with data stored in memory. For simplicity, we iterate through the original dataset and apply pooling on each element.
- Parameters:
dataset (Dataset) – The dataset to pool.
pooling_name (PoolingName) – The pooling method, either “mean” or “hamming”.
- Return type:
PooledDataset
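For the “mean” method, pooling presumably reduces each element's (T, d) sequence of frames to a single d-dimensional vector, so no DTW over time is needed afterwards. A minimal pure-Python sketch (the “hamming” method is not shown):

```python
# Sketch of mean pooling on one element: a (T, d) sequence of frames is
# reduced to a single (1, d) vector by averaging over time.
def mean_pool(frames):
    """frames: list of T feature vectors (length-d lists) -> (1, d)."""
    t, d = len(frames), len(frames[0])
    return [[sum(f[j] for f in frames) / t for j in range(d)]]


print(mean_pool([[0.0, 2.0], [2.0, 4.0]]))  # [[1.0, 3.0]]
```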
Advanced¶
Cell¶
- class fastabx.cell.Cell(a, b, x, header, description, is_symmetric)[source]¶
Individual cell of the ABX task.
Cells are the unit of work for the ABX Task and Score. They are collections of triplets (A, B, X) that share the same values for the on, by and across conditions.
- Parameters:
a (Batch) – Batch of A samples.
b (Batch) – Batch of B samples.
x (Batch) – Batch of X samples.
header (str) – Short string identifying the cell.
description (str) – Long string describing the cell.
is_symmetric (bool) – Whether or not the cell is symmetric (i.e., A and X are the same set).
Distance¶
- fastabx.distance.distance_on_cell(cell, distance)[source]¶
Compute the distance matrices between all A and X, and all B and X in the cell, for a given distance.
- Parameters:
cell (Cell) – The cell to compute the distances on.
distance (Distance) – The distance function to use. It takes two tensors of shape (n1, s1, d) and (n2, s2, d) and returns a tensor of shape (n1, n2, s1, s2).
- Return type:
tuple[Tensor, Tensor]
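The frame-level metrics listed earlier can be sketched in plain Python. For example, the “angular” distance is presumably the angle between two vectors, i.e. the arccosine of their cosine similarity; whether fastabx scales it to [0, 1] by dividing by π, as done below, is an assumption of this sketch:

```python
# Sketch of a frame-level "angular" distance (assumption: the angle
# between the two vectors, scaled to [0, 1] by dividing by pi).
import math


def angular(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / (nu * nv)))  # clamp rounding error
    return math.acos(cos) / math.pi


print(angular([1.0, 0.0], [0.0, 1.0]))  # 0.5 (orthogonal vectors)
```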
- fastabx.distance.abx_on_cell(cell, distance, *, mask=None)[source]¶
Compute the ABX of a cell using the given distance.
- Parameters:
cell (Cell) – The cell to compute the ABX on.
distance (Distance) – The distance function to use. It takes two tensors of shape (n1, s1, d) and (n2, s2, d) and returns a tensor of shape (n1, n2, s1, s2).
mask (Tensor | None) – Optional boolean mask of shape (nx, na, nb) to select which triplets to include in the score.
- Return type:
Tensor
DTW¶
- fastabx.dtw.dtw(distances)[source]¶
Compute the DTW of the given distances 2D tensor.
- Parameters:
distances (Tensor) – A 2D tensor of shape (n, m) representing the pairwise distances between two sequences.
- Return type:
Tensor
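The standard DTW recurrence behind this function can be sketched in pure Python: the accumulated cost at (i, j) is the local distance plus the cheapest of the three predecessors (i-1, j), (i, j-1) and (i-1, j-1). Whether fastabx returns the full accumulated-cost matrix or something else is not specified here; this sketch returns only the final cost.

```python
# Sketch of the standard DTW recurrence over a 2D cost matrix
# (list of lists), returning the final accumulated cost.
def dtw_cost(distances):
    n, m = len(distances), len(distances[0])
    acc = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            best = 0.0
            if i and j:  # interior cell: three possible predecessors
                best = min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
            elif i:      # first column: only vertical moves
                best = acc[i - 1][j]
            elif j:      # first row: only horizontal moves
                best = acc[i][j - 1]
            acc[i][j] = distances[i][j] + best
    return acc[n - 1][m - 1]


print(dtw_cost([[0.0, 1.0], [1.0, 0.0]]))  # 0.0: the diagonal path is free
```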
- fastabx.dtw.dtw_batch(distances, sx, sy, *, symmetric)[source]¶
Compute the batched DTW on the distances 4D tensor.
- Parameters:
distances (Tensor) – A 4D tensor of shape (n1, n2, s1, s2) representing the pairwise distances between two batches of sequences.
sx (Tensor) – A 1D tensor of shape (n1,) representing the lengths of the sequences in the first batch.
sy (Tensor) – A 1D tensor of shape (n2,) representing the lengths of the sequences in the second batch.
symmetric (bool) – Whether or not the DTW is symmetric (i.e., the two batches are the same).
- Return type:
Tensor
Constraints¶
- type fastabx.constraints.Constraints¶
Type alias for Iterable[pl.Expr | pl.Series | str].
Should be a valid input to pl.DataFrame.filter. See With constraints to understand how to use them.
- fastabx.constraints.constraints_all_different(*columns)[source]¶
Return Constraints that ensure that each specified column has different values for A, B and X.
- Parameters:
columns (str) – The columns to apply the constraints on.
- Return type:
Constraints
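To illustrate what such a constraint enforces (a hypothetical pure-Python analogue, not the polars-based implementation): with constraints_all_different("speaker"), a triplet survives only if its A, B and X speakers are pairwise different. This corresponds to the kind of boolean mask of shape (nx, na, nb) accepted by abx_on_cell:

```python
# Hypothetical analogue of constraints_all_different("speaker"): build a
# boolean mask over (x, a, b) triplets that is True only where the three
# speakers are pairwise different. Not the polars implementation.
def all_different_mask(a_spk, b_spk, x_spk):
    """Boolean mask of shape (nx, na, nb)."""
    return [
        [[a != b and a != x and b != x for b in b_spk] for a in a_spk]
        for x in x_spk
    ]


mask = all_different_mask(["s1"], ["s2"], ["s1", "s3"])
print(mask)  # [[[False]], [[True]]]: only the x="s3" triplet survives
```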
Environment variables¶
FASTABX_WITH_LIBRILIGHT_BUG
: If set to 1, changes the behaviour of Dataset.from_item to match Libri-Light. Every feature will be one frame shorter. This should be set only if you want to replicate previous results obtained with Libri-Light / ZeroSpeech 2021. See Slicing features for more details on how features are sliced.