One hot encoding

One hot encoding of time-aligned tokens

Alignment --> {Framed}OneHotProcessor --> Features

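One hot features are built from a time alignment of the spoken tokens. They come in two flavours:
  • OneHotProcessor converts an Alignment directly to Features and preserves the timestamps of the original alignment.

  • FramedOneHotProcessor computes the one hot encoding on framed signals, i.e. on overlapping time windows.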

Examples

Create a fake alignment:

>>> import numpy as np
>>> from shennong.alignment import Alignment
>>> alignment = Alignment(np.asarray([[0, 1], [1, 2]]), np.asarray(['a', 'b']))
>>> alignment
0 1 a
1 2 b

Extract onehot vectors from it:

>>> from shennong.features.processor.onehot import OneHotProcessor
>>> processor = OneHotProcessor()
>>> onehot = processor.process(alignment)
>>> onehot.times
array([[0, 1],
       [1, 2]])
>>> onehot.data
array([[ True, False],
       [False,  True]])

class shennong.features.processor.onehot.OneHotProcessor(tokens=None)[source]

Bases: shennong.features.processor.onehot._OneHotBase

Simple version of one hot features encoding

The OneHotProcessor directly converts an Alignment to features.Features while preserving the timestamps of the original alignment.

Parameters

tokens (sequence, optional) – The tokens composing the alignment. Specify the tokens if you want consistent one-hot vectors across different Features. By default the tokens are extracted from the alignment in process().
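To illustrate why a fixed token inventory matters, here is a minimal pure-Python sketch. It is not shennong's implementation: the helper `one_hot_rows` is hypothetical, and it assumes one column per token with columns ordered by sorted token name. Without an explicit inventory, the meaning of each column depends on the tokens that happen to occur in the input.

```python
def one_hot_rows(labels, tokens=None):
    """Return (columns, rows) for a sequence of token labels.

    Hypothetical helper for illustration only. Assumes one column per
    token, with columns ordered by sorted token name.
    """
    # Without an explicit inventory, fall back to the tokens seen in the labels
    columns = sorted(set(labels) if tokens is None else set(tokens))
    unknown = set(labels) - set(columns)
    if unknown:
        raise ValueError(f'tokens not in inventory: {sorted(unknown)}')

    index = {token: i for i, token in enumerate(columns)}
    rows = [[i == index[label] for i in range(len(columns))]
            for label in labels]
    return columns, rows


# Inferred inventories differ: column 0 means 'a' in one case, 'b' in the other
cols_1, rows_1 = one_hot_rows(['a', 'b'])
cols_2, rows_2 = one_hot_rows(['b', 'c'])

# A shared inventory gives both encodings the same three columns
shared = ['a', 'b', 'c']
cols_3, rows_3 = one_hot_rows(['a', 'b'], tokens=shared)
cols_4, rows_4 = one_hot_rows(['b', 'c'], tokens=shared)
```

In this sketch `rows_1` and `rows_2` are identical even though they encode different tokens, whereas `rows_3` and `rows_4` can be compared column by column.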

process(alignment)[source]

Returns one-hot features computed from a time alignment

Parameters

alignment (Alignment) – The time alignment of tokens to compute one-hot features on

Returns

features (Features) – The computed features

get_params(deep=True)

Get parameters for this processor.

Parameters

deep (boolean, optional) – If True, will return the parameters for this processor and contained subobjects that are processors. Defaults to True.

Returns

params (mapping of string to any) – Parameter names mapped to their values.

get_properties()

Return the processor's properties as a dictionary

property name

Name of the processor

property ndims

Dimension of the output feature frames

process_all(signals, njobs=None)

Returns features processed from several input signals

This function processes the features in parallel jobs.

Parameters
  • signals (dict of shennong.audio.Audio) – A dictionary of input audio signals to process features on, where the keys are item names and values are audio signals.

  • njobs (int, optional) – The number of parallel jobs to run in background. Defaults to the number of CPU cores available on the machine.

Returns

features (FeaturesCollection) – The computed features on each input signal. The keys of output features are the keys of the input signals.

Raises

ValueError – If the njobs parameter is <= 0
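The contract described above (a dict of named inputs, an optional `njobs`, a ValueError for non-positive values, output keys matching input keys) can be sketched with the standard library. This is a hypothetical stand-in, not the library's code: the free function `process_all` and its `process` argument are illustrative names, and threads are used here only as an assumption about how the jobs might run.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def process_all(process, items, njobs=None):
    """Apply `process` to each value of `items` using `njobs` workers.

    Hypothetical stand-in for illustration, based on standard library threads.
    """
    if njobs is None:
        # Default to the number of CPU cores available on the machine
        njobs = os.cpu_count() or 1
    if njobs <= 0:
        raise ValueError(f'njobs must be strictly positive, it is {njobs}')

    names = list(items)
    with ThreadPoolExecutor(max_workers=njobs) as pool:
        # pool.map preserves the input order, so results line up with names
        results = list(pool.map(process, (items[name] for name in names)))

    # The keys of the output are the keys of the input
    return dict(zip(names, results))


lengths = process_all(len, {'utt1': 'abc', 'utt2': 'de'}, njobs=2)
```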

set_params(**params)

Set the parameters of this processor.

Returns

self

Raises

ValueError – If any given parameter in params is invalid for the processor.

property tokens

class shennong.features.processor.onehot.FramedOneHotProcessor(tokens=None, sample_rate=16000, frame_shift=0.01, frame_length=0.025, window_type='povey', blackman_coeff=0.42)[source]

Bases: shennong.features.processor.onehot._OneHotBase

One-hot encoding on framed signals

Computes the one-hot encoding on framed signals (i.e. on overlapping time windows)

Parameters
  • tokens (sequence, optional) – The tokens composing the alignment. Specify the tokens if you want consistent one-hot vectors across different Features. By default the tokens are extracted from the alignment in process().

  • sample_rate (int, optional) – Sample frequency used for frames, in Hz. Defaults to 16000 (16 kHz).

  • frame_shift (float, optional) – Frame shift in seconds. Defaults to 0.01 (10 ms).

  • frame_length (float, optional) – Frame length in seconds. Defaults to 0.025 (25 ms).

  • window_type ({'povey', 'hanning', 'hamming', 'rectangular', 'blackman'}) – The type of the window. Defaults to ‘povey’ (like Hamming but goes to zero at the edges).

  • blackman_coeff (float, optional) – The constant coefficient for the generalized Blackman window, used only when window_type is ‘blackman’. Defaults to 0.42.
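As a rough illustration of how frame_shift and frame_length turn an alignment into overlapping frames, here is a pure-Python sketch under stated assumptions. The helper `frame_labels` is hypothetical; it keeps only frames that fit entirely inside the alignment and labels each frame with the token that overlaps it the longest, which ignores the window weighting (equivalent to assuming a rectangular window). The actual processor may count frames and weight tokens differently.

```python
def frame_labels(intervals, labels, frame_shift=0.01, frame_length=0.025):
    """Label each frame with the token covering most of it.

    Hypothetical helper for illustration. `intervals` is a list of
    (tstart, tstop) pairs in seconds and `labels` the matching tokens.
    """
    start, stop = intervals[0][0], intervals[-1][1]

    # Assumption: keep only frames that fit entirely inside the alignment.
    # The small epsilon guards against floating point rounding.
    nframes = int((stop - start - frame_length) / frame_shift + 1e-9) + 1

    framed = []
    for i in range(max(nframes, 0)):
        fstart = start + i * frame_shift
        fstop = fstart + frame_length
        # Duration of the overlap between this frame and each aligned token
        overlaps = [max(0.0, min(fstop, tstop) - max(fstart, tstart))
                    for tstart, tstop in intervals]
        best = max(range(len(labels)), key=overlaps.__getitem__)
        framed.append(labels[best])
    return framed


# Two tokens of 50 ms each, framed with the default 10 ms shift and 25 ms length
framed = frame_labels([(0.0, 0.05), (0.05, 0.1)], ['a', 'b'])
```

With these inputs the sketch yields eight frames. The frame starting at 30 ms still overlaps 'a' for 20 ms and is labelled 'a', while the frame starting at 40 ms overlaps 'b' for 15 ms and is labelled 'b'.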

property sample_rate

The processor operation sample rate

Must match the sample rate of the signal specified in process

get_params(deep=True)

Get parameters for this processor.

Parameters

deep (boolean, optional) – If True, will return the parameters for this processor and contained subobjects that are processors. Defaults to True.

Returns

params (mapping of string to any) – Parameter names mapped to their values.

get_properties()

Return the processor's properties as a dictionary

property name

Name of the processor

property ndims

Dimension of the output feature frames

process_all(signals, njobs=None)

Returns features processed from several input signals

This function processes the features in parallel jobs.

Parameters
  • signals (dict of shennong.audio.Audio) – A dictionary of input audio signals to process features on, where the keys are item names and values are audio signals.

  • njobs (int, optional) – The number of parallel jobs to run in background. Defaults to the number of CPU cores available on the machine.

Returns

features (FeaturesCollection) – The computed features on each input signal. The keys of output features are the keys of the input signals.

Raises

ValueError – If the njobs parameter is <= 0

set_params(**params)

Set the parameters of this processor.

Returns

self

Raises

ValueError – If any given parameter in params is invalid for the processor.
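The get_params and set_params pair follows a familiar pattern: parameters are exposed as a mapping, updates return the processor itself so calls can be chained, and unknown names are rejected. The toy class below is a hypothetical sketch of that contract, not the library's code; `ToyProcessor` and its validation logic are assumptions made for illustration.

```python
class ToyProcessor:
    """Hypothetical processor with two parameters, for illustration only."""

    def __init__(self, frame_shift=0.01, frame_length=0.025):
        self.frame_shift = frame_shift
        self.frame_length = frame_length

    def get_params(self, deep=True):
        # `deep` is accepted for symmetry but unused: no nested processors here
        return {'frame_shift': self.frame_shift,
                'frame_length': self.frame_length}

    def set_params(self, **params):
        # Validate every name first so a bad call leaves the object unchanged
        invalid = sorted(set(params) - set(self.get_params()))
        if invalid:
            raise ValueError(f'invalid parameters: {invalid}')
        for name, value in params.items():
            setattr(self, name, value)
        return self


# set_params returns the processor, so the call can be chained
processor = ToyProcessor().set_params(frame_shift=0.02)
params = processor.get_params()
```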

property tokens

property frame_shift

Frame shift in seconds

property frame_length

Frame length in seconds

process(alignment)[source]

Returns one-hot features computed from a time alignment

Parameters

alignment (Alignment) – The time alignment of tokens to compute one-hot features on

Returns

features (Features) – The computed features