One hot encoding
One hot encoding of time-aligned tokens
One hot features are built from a time alignment of the spoken tokens. They come in two flavours:
OneHotProcessor
simply encodes the tokens of an alignment into one hot vectors
FramedOneHotProcessor
splits the alignment into windowed frames before doing the one hot encoding
Examples
Create a fake alignment:
>>> import numpy as np
>>> from shennong.alignment import Alignment
>>> alignment = Alignment(np.asarray([[0, 1], [1, 2]]), np.asarray(['a', 'b']))
>>> alignment
0 1 a
1 2 b
Extract onehot vectors from it:
>>> from shennong.processor.onehot import OneHotProcessor
>>> processor = OneHotProcessor()
>>> onehot = processor.process(alignment)
>>> onehot.times
array([[0, 1],
[1, 2]])
>>> onehot.data
array([[ True, False],
[False, True]])
-
class shennong.processor.onehot.OneHotProcessor(tokens=None)
Bases: shennong.processor.onehot._OneHotBase
Simple version of one hot features encoding
The OneHotProcessor directly converts an Alignment to features.Features while preserving the timestamps of the original alignment.
- Parameters
tokens (sequence, optional) – The tokens composing the alignment. Specify the tokens if you want to have consistent one-hot vectors across different Features. By default the tokens are extracted from the alignment in process().
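The effect of the tokens parameter can be illustrated with a minimal pure-Python sketch. This is not shennong's implementation: the helper one_hot is hypothetical, and it assumes that inferred tokens are ordered alphabetically, as the example above suggests.

```python
def one_hot(labels, tokens=None):
    """Encode each label as a boolean row, one column per token.

    Hypothetical helper for illustration only. When `tokens` is None, the
    columns are the sorted unique labels of this input alone.
    """
    if tokens is None:
        tokens = sorted(set(labels))
    index = {token: i for i, token in enumerate(tokens)}
    return [[i == index[label] for i in range(len(tokens))] for label in labels]


# Tokens inferred per input: 'b' is column 1 in the first encoding
# but column 0 in the second, so the columns are not comparable.
first = one_hot(['a', 'b'])
second = one_hot(['b', 'c'])

# Shared token list: 'b' is column 1 in both encodings.
shared = ['a', 'b', 'c']
first_shared = one_hot(['a', 'b'], tokens=shared)
second_shared = one_hot(['b', 'c'], tokens=shared)
```

With inferred tokens both encodings are 2 columns wide and look identical even though they describe different tokens. With the shared list both are 3 columns wide and a given column always refers to the same token.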
-
process(alignment)
Returns features processed from an input alignment
- Parameters
alignment (Alignment) – The input alignment to encode as one hot vectors
- Returns
features (Features) – The computed features
-
get_params(deep=True)
Get parameters for this processor.
- Parameters
deep (boolean, optional) – If True, will return the parameters for this processor and contained subobjects that are processors. Defaults to True.
- Returns
params (mapping of string to any) – Parameter names mapped to their values.
-
get_properties(**kwargs)
Return the processor's properties as a dictionary
-
property log
Processor logger
-
property name
Name of the processor
-
property ndims
Dimension of the output features frames
-
process_all(utterances, njobs=None, **kwargs)
Returns features processed from several input utterances
This function processes the features in parallel jobs.
- Parameters
utterances (shennong.utterances.Utterances) – The utterances on which to process features.
njobs (int, optional) – The number of parallel jobs to run in background. Defaults to the number of CPU cores available on the machine.
**kwargs (dict, optional) – Extra arguments to be forwarded to the process method. Keys must be the same as for utterances.
- Returns
features (FeaturesCollection) – The computed features on each input signal. The keys of output features are the keys of the input utterances.
- Raises
ValueError – If the njobs parameter is <= 0 or if an entry is missing in optional kwargs.
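The contract described above (positive njobs, per-utterance extra arguments, output keys equal to input keys) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: the free function process_all, its process argument and the use of a thread pool are all hypothetical, and utterances are modelled as a plain dict.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def process_all(process, utterances, njobs=None, **kwargs):
    """Apply `process` to each utterance in parallel (illustrative sketch)."""
    if njobs is None:
        njobs = os.cpu_count() or 1
    if njobs <= 0:
        raise ValueError('njobs must be strictly positive')

    # Each extra argument must provide one value per utterance key.
    for name, values in kwargs.items():
        missing = set(utterances) - set(values)
        if missing:
            raise ValueError(f'missing entries in {name}: {sorted(missing)}')

    def run(key):
        extra = {name: values[key] for name, values in kwargs.items()}
        return key, process(utterances[key], **extra)

    with ThreadPoolExecutor(max_workers=njobs) as pool:
        # The output is keyed exactly like the input.
        return dict(pool.map(run, utterances))


# Stand-in data: `len` plays the role of the processing function.
utterances = {'utt1': ['a', 'b'], 'utt2': ['b']}
features = process_all(len, utterances, njobs=2)
```

Here features maps 'utt1' and 'utt2' to the result of the processing function on each entry, and calling the sketch with njobs=0 raises ValueError.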
-
set_logger(level, formatter='%(levelname)s - %(name)s - %(message)s')
Change level and/or format of the processor’s logger
- Parameters
level (str) – The minimum log level handled by the logger (any message below this level will be ignored). Must be ‘debug’, ‘info’, ‘warning’ or ‘error’.
formatter (str, optional) – A string to format the log messages, see https://docs.python.org/3/library/logging.html#formatter-objects. By default, displays level, name and message. Use ‘%(asctime)s - %(levelname)s - %(name)s - %(message)s’ to display time, level, name and message.
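The formatter strings follow the conventions of Python's standard logging module. The sketch below shows what the default format produces, assuming the processor's logger behaves like a regular logging.Logger; the logger name 'onehot-demo' is made up for the illustration.

```python
import io
import logging

# Hypothetical logger name, for illustration only.
logger = logging.getLogger('onehot-demo')
logger.setLevel(logging.INFO)  # messages below INFO are dropped
logger.propagate = False

# Capture the formatted messages in memory instead of printing them.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s - %(name)s - %(message)s'))
logger.addHandler(handler)

logger.debug('not shown')
logger.info('processing alignment')

output = stream.getvalue()
```

After running this, output contains the single line "INFO - onehot-demo - processing alignment": the debug message is filtered out by the level, and the format string controls the layout of the remaining one.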
-
set_params(**params)
Set the parameters of this processor.
- Returns
self
- Raises
ValueError – If any given parameter in params is invalid for the processor.
-
property tokens
-
class shennong.processor.onehot.FramedOneHotProcessor(tokens=None, sample_rate=16000, frame_shift=0.01, frame_length=0.025, window_type='povey', blackman_coeff=0.42)
Bases: shennong.processor.onehot._OneHotBase
One-hot encoding on framed signals
Computes the one-hot encoding on framed signals (i.e. on overlapping time windows)
- Parameters
tokens (sequence, optional) – The tokens composing the alignment. Specify the tokens if you want to have consistent one-hot vectors across different Features. By default the tokens are extracted from the alignment in process().
sample_rate (int, optional) – Sample frequency used for frames, in Hz, defaults to 16000 (16 kHz)
frame_shift (float, optional) – Frame shift in seconds, defaults to 0.01 (10 ms)
frame_length (float, optional) – Frame length in seconds, defaults to 0.025 (25 ms)
window_type ({'povey', 'hanning', 'hamming', 'rectangular', 'blackman'}) – The type of the window, default is ‘povey’ (like hamming but goes to zero at edges)
blackman_coeff (float, optional) – The constant coefficient for generalized Blackman window, used only when window_type is ‘blackman’, default is 0.42.
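The window types listed above can be sketched as follows. The formulas are the Kaldi-style definitions that the 'povey' name points to; that the library uses exactly these is an assumption, and the function name window and its signature are hypothetical.

```python
import math


def window(size, window_type='povey', blackman_coeff=0.42):
    """Return `size` window weights, using Kaldi-style definitions (assumed)."""
    if size < 2:
        raise ValueError('size must be at least 2')

    a = 2 * math.pi / (size - 1)
    weights = []
    for i in range(size):
        if window_type == 'hanning':
            w = 0.5 - 0.5 * math.cos(a * i)
        elif window_type == 'hamming':
            w = 0.54 - 0.46 * math.cos(a * i)
        elif window_type == 'povey':
            # A Hanning window raised to the power 0.85.
            w = (0.5 - 0.5 * math.cos(a * i)) ** 0.85
        elif window_type == 'rectangular':
            w = 1.0
        elif window_type == 'blackman':
            w = (blackman_coeff - 0.5 * math.cos(a * i)
                 + (0.5 - blackman_coeff) * math.cos(2 * a * i))
        else:
            raise ValueError('unknown window type: ' + window_type)
        weights.append(w)
    return weights


# 400 samples is a 25 ms frame at 16 kHz.
povey = window(400)
hamming = window(400, 'hamming')
```

The first Povey weight is 0 while the first Hamming weight is about 0.08, which is what "goes to zero at edges" refers to.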
-
property sample_rate
The processor operation sample rate
Must match the sample rate of the signal specified in process
-
get_params(deep=True)
Get parameters for this processor.
- Parameters
deep (boolean, optional) – If True, will return the parameters for this processor and contained subobjects that are processors. Defaults to True.
- Returns
params (mapping of string to any) – Parameter names mapped to their values.
-
get_properties(**kwargs)
Return the processor's properties as a dictionary
-
property log
Processor logger
-
property name
Name of the processor
-
property ndims
Dimension of the output features frames
-
process_all(utterances, njobs=None, **kwargs)
Returns features processed from several input utterances
This function processes the features in parallel jobs.
- Parameters
utterances (shennong.utterances.Utterances) – The utterances on which to process features.
njobs (int, optional) – The number of parallel jobs to run in background. Defaults to the number of CPU cores available on the machine.
**kwargs (dict, optional) – Extra arguments to be forwarded to the process method. Keys must be the same as for utterances.
- Returns
features (FeaturesCollection) – The computed features on each input signal. The keys of output features are the keys of the input utterances.
- Raises
ValueError – If the njobs parameter is <= 0 or if an entry is missing in optional kwargs.
-
set_logger(level, formatter='%(levelname)s - %(name)s - %(message)s')
Change level and/or format of the processor’s logger
- Parameters
level (str) – The minimum log level handled by the logger (any message below this level will be ignored). Must be ‘debug’, ‘info’, ‘warning’ or ‘error’.
formatter (str, optional) – A string to format the log messages, see https://docs.python.org/3/library/logging.html#formatter-objects. By default, displays level, name and message. Use ‘%(asctime)s - %(levelname)s - %(name)s - %(message)s’ to display time, level, name and message.
-
set_params(**params)
Set the parameters of this processor.
- Returns
self
- Raises
ValueError – If any given parameter in params is invalid for the processor.
-
property tokens
-
property frame_shift
Frame shift in seconds
-
property frame_length
Frame length in seconds
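To make the frame parameters concrete, here is a minimal sketch of how sample_rate, frame_shift and frame_length translate into frames, and how a frame could be given a single token. It makes two assumptions that may not match the library: only full frames are kept, and each frame is labelled with the token covering most of its samples, ignoring any window weighting. Both function names are hypothetical.

```python
def frame_boundaries(nsamples, sample_rate=16000, frame_shift=0.01, frame_length=0.025):
    """Return the (start, stop) sample indices of each full frame."""
    shift = round(sample_rate * frame_shift)    # 160 samples with the defaults
    length = round(sample_rate * frame_length)  # 400 samples with the defaults
    if nsamples < length:
        return []
    nframes = 1 + (nsamples - length) // shift
    return [(i * shift, i * shift + length) for i in range(nframes)]


def dominant_token(start, stop, alignment, sample_rate=16000):
    """Return the token overlapping most samples of the frame [start, stop)."""
    best_token, best_overlap = None, 0
    for onset, offset, token in alignment:
        first = max(start, round(onset * sample_rate))
        last = min(stop, round(offset * sample_rate))
        if last - first > best_overlap:
            best_token, best_overlap = token, last - first
    return best_token


# Same alignment as in the example at the top: 'a' on [0, 1], 'b' on [1, 2].
alignment = [(0.0, 1.0, 'a'), (1.0, 2.0, 'b')]
frames = frame_boundaries(nsamples=2 * 16000)
labels = [dominant_token(start, stop, alignment) for start, stop in frames]
```

With the default parameters, the 2 second alignment yields 198 overlapping frames of 400 samples each. Frames that straddle the boundary at 1 second are labelled with whichever token covers more of the frame.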