Energy
Extraction of energy from audio signals
Computes the energy of each windowed frame extracted from an audio
signal. The output is identical to the first coefficient computed by
MfccProcessor or PlpProcessor.
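As a rough illustration of what "energy on window frames" means, the sketch below computes the log-energy of each complete frame with plain NumPy. It is not the EnergyProcessor implementation: the helper name `frame_log_energy` is hypothetical, and dithering, preemphasis, DC offset removal and windowing are all left out.

```python
import numpy as np

def frame_log_energy(signal, sample_rate=16000,
                     frame_shift=0.01, frame_length=0.025):
    """Log-energy of each complete frame of a 1-D signal (illustrative only)."""
    shift = int(round(sample_rate * frame_shift))
    length = int(round(sample_rate * frame_length))
    if len(signal) < length:
        return np.empty((0, 1))
    nframes = 1 + (len(signal) - length) // shift
    # Stack the frames as rows of a (nframes, length) array
    frames = np.stack(
        [signal[i * shift:i * shift + length] for i in range(nframes)])
    # Energy is the sum of squared samples; floor it to avoid log(0)
    energy = np.maximum((frames ** 2).sum(axis=1), np.finfo(float).eps)
    return np.log(energy)[:, np.newaxis]

# One second of synthetic noise at 16 kHz stands in for a real recording
signal = np.random.default_rng(0).standard_normal(16000)
energy = frame_log_energy(signal)
print(energy.shape)  # (98, 1)
```

The output has one row per frame and a single column, matching the `(nframes, 1)` shapes shown in the examples.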
Examples
>>> import numpy as np
>>> from shennong.audio import Audio
>>> from shennong.features.processor.energy import EnergyProcessor
>>> audio = Audio.load('./test/data/test.wav')
Compute the energy of the audio signal:
>>> proc = EnergyProcessor(sample_rate=audio.sample_rate)
>>> energy1 = proc.process(audio)
>>> energy1.shape
(140, 1)
By default the energy is log-compressed. You can disable or change the compression; the available options are ‘log’, ‘sqrt’ and ‘off’:
>>> proc.compression = 'off'
>>> energy2 = proc.process(audio)
>>> np.allclose(np.log(energy2.data), energy1.data, rtol=1)
True
The two energies above are not strictly identical because of dithering.
You can also fix the framing and windowing parameters:
>>> proc.frame_shift = 0.02
>>> proc.frame_length = 0.05
>>> proc.window_type = 'hanning'
>>> energy3 = proc.process(audio)
>>> energy3.shape
(69, 1)
-
class
shennong.features.processor.energy.
EnergyProcessor
(sample_rate=16000, frame_shift=0.01, frame_length=0.025, dither=1.0, preemph_coeff=0.97, remove_dc_offset=True, window_type='povey', round_to_power_of_two=True, blackman_coeff=0.42, snip_edges=True, raw_energy=True, compression='log')
Bases:
shennong.features.processor.base.FramesProcessor
-
property
name
Name of the processor
-
property
ndims
Dimension of the output feature frames
-
property
blackman_coeff
Constant coefficient for generalized Blackman window
Used only if window_type is ‘blackman’
-
property
compression
Type of energy compression
Must be ‘off’ (disable compression), ‘log’ (natural logarithm) or ‘sqrt’ (square root).
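The three compression modes can be sketched with NumPy as follows. The `compress` helper is hypothetical and only mirrors the option names documented above; it is not the processor's own code.

```python
import numpy as np

def compress(energy, compression='log'):
    """Apply one of the documented compression modes to raw energies."""
    energy = np.asarray(energy, dtype=float)
    if compression == 'off':
        return energy            # raw energy, unchanged
    if compression == 'log':
        return np.log(energy)    # natural logarithm
    if compression == 'sqrt':
        return np.sqrt(energy)   # square root
    raise ValueError(
        "compression must be 'off', 'log' or 'sqrt', "
        "it is '{}'".format(compression))

raw = np.array([1.0, 4.0, 9.0])
print(compress(raw, 'sqrt'))  # [1. 2. 3.]
```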
-
property
dither
Amount of dithering
0.0 means no dither
-
property
frame_length
Frame length in seconds
-
property
frame_shift
Frame shift in seconds
-
get_params
(deep=True) Get parameters for this processor.
- Parameters
deep (boolean, optional) – If True, will return the parameters for this processor and contained subobjects that are processors. Defaults to True.
- Returns
params (mapping of string to any) – Parameter names mapped to their values.
-
get_properties
() Return the processor's properties as a dictionary
-
property
preemph_coeff
Coefficient for use in signal preemphasis
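For readers unfamiliar with preemphasis, here is a minimal sketch of the usual first-order filter `y[n] = x[n] - coeff * x[n-1]`, applied to a single non-empty frame. The `preemphasize` helper is hypothetical, and the handling of the first sample (which has no predecessor in the frame) follows the Kaldi convention as far as I know; treat that detail as an assumption.

```python
import numpy as np

def preemphasize(frame, coeff=0.97):
    """First-order preemphasis of one non-empty frame (illustrative only)."""
    frame = np.asarray(frame, dtype=float)
    out = np.empty_like(frame)
    # Each sample minus a scaled copy of the previous one
    out[1:] = frame[1:] - coeff * frame[:-1]
    # Assumption: the first sample is used as its own predecessor
    out[0] = frame[0] - coeff * frame[0]
    return out

# A constant frame is strongly attenuated, since it has no high frequencies
flat = preemphasize(np.ones(4))
print(np.allclose(flat, 0.03))  # True
```

A coefficient of 0.0 leaves the frame unchanged.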
-
process_all
(signals, njobs=None) Returns features processed from several input signals
This function processes the features in parallel jobs.
- Parameters
signals (dict of Audio) – A dictionary of input audio signals to process features on, where the keys are item names and values are audio signals.
njobs (int, optional) – The number of parallel jobs to run in background. Defaults to the number of CPU cores available on the machine.
- Returns
features (FeaturesCollection) – The computed features on each input signal. The keys of output features are the keys of the input signals.
- Raises
ValueError – If the njobs parameter is <= 0
-
property
remove_dc_offset
If True, subtract mean from waveform on each frame
-
property
round_to_power_of_two
If true, round window size to power of two
This is done by zero-padding input to FFT
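To make the rounding concrete, the sketch below computes the window size in samples and the next power of two it would be zero-padded to. The `padded_window_size` helper is hypothetical and only illustrates the arithmetic.

```python
def padded_window_size(sample_rate=16000, frame_length=0.025):
    """Return (window size in samples, size after rounding up to 2**k)."""
    size = int(round(sample_rate * frame_length))
    padded = 1
    # Double until the padded size can hold the whole window
    while padded < size:
        padded *= 2
    return size, padded

print(padded_window_size())  # (400, 512)
```

With the default parameters, a 400-sample window is padded with 112 zeros.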
-
property
sample_rate
Waveform sample frequency in Hertz
Must match the sample rate of the signal specified in process
-
set_params
(**params) Set the parameters of this processor.
- Returns
self
- Raises
ValueError – If any given parameter in params is invalid for the processor.
-
property
snip_edges
If true, output only frames that completely fit in the file
When True the number of frames depends on the frame_length. If False, the number of frames depends only on the frame_shift, and we reflect the data at the ends.
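The dependence of the frame count on `snip_edges` can be sketched as below. The `num_frames` helper is hypothetical, and the rounding used when `snip_edges` is False follows the Kaldi convention as I understand it, so treat it as an assumption. The 22720-sample length is also hypothetical: it was picked only because, at 16 kHz, it reproduces the `(140, 1)` and `(69, 1)` shapes from the examples.

```python
def num_frames(num_samples, sample_rate=16000, frame_shift=0.01,
               frame_length=0.025, snip_edges=True):
    """Number of frames extracted from a signal (illustrative only)."""
    shift = int(round(sample_rate * frame_shift))
    length = int(round(sample_rate * frame_length))
    if snip_edges:
        # Only frames that fit entirely in the signal are kept
        if num_samples < length:
            return 0
        return 1 + (num_samples - length) // shift
    # Assumed Kaldi-style rounding: depends on the shift only
    return (num_samples + shift // 2) // shift

print(num_frames(22720))                                       # 140
print(num_frames(22720, frame_shift=0.02, frame_length=0.05))  # 69
print(num_frames(22720, snip_edges=False))                     # 142
```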
-
property
window_type
Type of window
Must be ‘hamming’, ‘hanning’, ‘povey’, ‘rectangular’ or ‘blackman’
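The window shapes can be sketched with NumPy as below. The formulas follow the Kaldi definitions as far as I know (in particular, ‘povey’ is a Hanning window raised to the power 0.85); the `make_window` helper is hypothetical and is not taken from the shennong source.

```python
import numpy as np

def make_window(window_type, size, blackman_coeff=0.42):
    """Build a window of `size` samples (assumed Kaldi-style definitions)."""
    n = np.arange(size)
    a = 2 * np.pi / (size - 1)
    if window_type == 'hanning':
        return 0.5 - 0.5 * np.cos(a * n)
    if window_type == 'hamming':
        return 0.54 - 0.46 * np.cos(a * n)
    if window_type == 'povey':
        # Like Hanning, but flatter around the center of the frame
        return (0.5 - 0.5 * np.cos(a * n)) ** 0.85
    if window_type == 'rectangular':
        return np.ones(size)
    if window_type == 'blackman':
        return (blackman_coeff - 0.5 * np.cos(a * n)
                + (0.5 - blackman_coeff) * np.cos(2 * a * n))
    raise ValueError('unknown window type: {}'.format(window_type))

window = make_window('povey', 400)
print(window.shape)  # (400,)
```

With these definitions the ‘hanning’, ‘hamming’ and default ‘blackman’ windows coincide with `np.hanning`, `np.hamming` and `np.blackman`.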
-
property
raw_energy
If true, compute energy before preemphasis and windowing
-
property