A Gold object contains a VAD, a word alignment, and a phone alignment.

Each alignment can be represented either as an interval tree or as a dictionary, depending on the usage (an interval tree is fast for interval retrieval and overlap detection).

class tde.readers.gold_reader.Gold(vad_path=None, wrd_path=None, phn_path=None)[source]

Bases: object


Read the gold phoneme file with fields: speaker/file start end annotation

Returns a dict with the file/speaker as a key and the following structure:

gold['speaker'] = [{'start': list(…)}, {'end': list(…), 'symbol': list(…)}]
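
A minimal sketch of this kind of reader, assuming a whitespace-separated file with one `file start end annotation` entry per line. The function name `read_gold_dict`, the sample data, and the flat per-file layout with `start`, `end`, and `symbol` lists are illustrative assumptions, not the package's actual implementation.

```python
import io
from collections import defaultdict


def read_gold_dict(lines):
    """Hypothetical helper: parse 'file start end annotation' lines into a dict."""
    gold = defaultdict(lambda: {"start": [], "end": [], "symbol": []})
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        fname, start, end, symbol = fields
        gold[fname]["start"].append(float(start))
        gold[fname]["end"].append(float(end))
        gold[fname]["symbol"].append(symbol)
    return dict(gold)


# Made-up sample data, for illustration only.
sample = io.StringIO("s01 0.00 0.12 SIL\ns01 0.12 0.20 k\ns02 0.00 0.30 ae\n")
gold = read_gold_dict(sample)
```

Keeping the onsets, offsets, and symbols in parallel lists makes it cheap to look up everything for one file by key, which is the use case the dictionary form targets.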

read_gold_intervalTree(gold_path, symbol_type=None)[source]

Read the gold alignment and build an interval tree (each insertion takes O(log(n))). Then, for each interval found, search for its overlaps (O(log(n) + m), where m is the number of results), and check whether the interval should be kept.

  • gold (-) –

  • symbol_type (-) – if “word”, silences are discarded when found; if “phone”, silences are kept and a warning is raised if none are found


  • gold (a dict {fname: intervaltree}) – the interval tree of the gold phones for each file

  • ix2symbols (a dict) – maps each encoding index to its symbol (symbols are assigned numbers in order to compute the NED)

  • ValueError

    • If the alignment is not well formatted

  • UserWarning

    • If the phone alignment does not contain silences

  • AssertionError

    • If an interval contains an offset lower than the onset
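
The checks and the symbol encoding described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `load_alignment` is a hypothetical name, the silence labels `SIL` and `SPN` are guesses, and a sorted list of tuples stands in for the interval tree so that the example needs only the standard library.

```python
import warnings


def load_alignment(lines, symbol_type=None, silences=("SIL", "SPN")):
    """Hypothetical sketch: validate an alignment and encode its symbols.

    Returns (gold, ix2symbols), where gold maps each file name to a sorted
    list of (onset, offset, symbol_index) tuples.
    """
    gold, symbol2ix, found_silence = {}, {}, False
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            raise ValueError("badly formatted alignment line: %r" % line)
        fname, symbol = fields[0], fields[3]
        # float() itself raises ValueError on non-numeric timestamps.
        on, off = float(fields[1]), float(fields[2])
        assert off >= on, "offset lower than onset: %r" % line
        if symbol in silences:
            found_silence = True
            if symbol_type == "word":
                continue  # word alignments do not keep silences
        # Assign the next free integer to each new symbol.
        ix = symbol2ix.setdefault(symbol, len(symbol2ix))
        gold.setdefault(fname, []).append((on, off, ix))
    if symbol_type == "phone" and not found_silence:
        warnings.warn("phone alignment contains no silences", UserWarning)
    for intervals in gold.values():
        intervals.sort()
    ix2symbols = {ix: symbol for symbol, ix in symbol2ix.items()}
    return gold, ix2symbols


gold, ix2symbols = load_alignment(
    ["s01 0.00 0.12 SIL", "s01 0.12 0.20 k", "s01 0.20 0.31 ae"],
    symbol_type="phone",
)
```

Encoding symbols as integers keeps the stored intervals compact, and `ix2symbols` recovers the original labels when they are needed.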

get_intervals(on, off, gold, transcription)[source]

Given a filename and an interval, retrieve the list of covered intervals and their transcription. This is done using an interval tree, which is expected to run in O(log(n) + m), n being the number of intervals and m the number of covered intervals.

  • fname (str, name of the speaker) –

  • on (float, onset of the interval) –

  • off (float, offset of the interval) –

  • gold (dict of intervaltree, contains all gold phones) –

  • transcription (dict of tuples, contains the transcription of each interval) –
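
To illustrate the O(log(n) + m) lookup, here is a sketch that swaps the interval tree for binary search with the standard `bisect` module. It is only valid when the intervals of a file are sorted and do not overlap each other, and it treats "covered" as "overlapping the query"; both are assumptions. The name `covered_intervals` and the sample data are hypothetical.

```python
from bisect import bisect_left, bisect_right


def covered_intervals(on, off, starts, ends, symbols):
    """Hypothetical helper: return the intervals overlapping (on, off).

    starts, ends, and symbols are parallel lists describing sorted,
    non-overlapping intervals, so ends is sorted as well.
    """
    first = bisect_right(ends, on)    # first interval ending after `on`
    last = bisect_left(starts, off)   # first interval starting at or after `off`
    return [(starts[i], ends[i], symbols[i]) for i in range(first, last)]


# Made-up alignment for one file.
starts = [0.00, 0.12, 0.20, 0.31]
ends = [0.12, 0.20, 0.31, 0.40]
symbols = ["SIL", "k", "ae", "t"]

hits = covered_intervals(0.15, 0.35, starts, ends, symbols)
```

The two binary searches cost O(log(n)) and the slice costs O(m). An interval tree gives the same bound without requiring the intervals to be non-overlapping.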


Compute interval tree of silences