File Formats

Input Format

The package reads its input in the same format as the Zero Resource Speech Challenge (zerospeech.com). Each class is introduced by a header line and followed by its fragments, one per line:

Class 1:
wav1 on1 off1
wav2 on2 off2

Class 2:
wav1 on3 off3
wav3 on4 off4
wav2 on5 off5

The onset and offset times are expressed in seconds.

Note that every class must be followed by an empty line, including the last class, so the file itself must end with a blank line.
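As an illustration of the layout described above, here is a minimal Python sketch that writes and re-reads a class file. The helper names format_classes and parse_classes are hypothetical and are not part of the package; the three-decimal precision for times is also an assumption, since the format only states that times are in seconds.

```python
def format_classes(classes):
    """Render {class_id: [(wav, onset, offset), ...]} as class-file text.

    Hypothetical helper, not part of the package.
    """
    lines = []
    for class_id, fragments in classes.items():
        lines.append(f"Class {class_id}:")
        for wav, onset, offset in fragments:
            # Times are in seconds; three decimals is an arbitrary choice.
            lines.append(f"{wav} {onset:.3f} {offset:.3f}")
        # A blank line closes every class, including the last one.
        lines.append("")
    return "\n".join(lines) + "\n"


def parse_classes(text):
    """Parse class-file text back into {class_id: [(wav, onset, offset), ...]}.

    Hypothetical helper; class ids are returned as strings.
    """
    classes = {}
    current = None
    for line in text.split("\n"):
        line = line.strip()
        if not line:
            current = None  # a blank line ends the current class
            continue
        if line.startswith("Class"):
            current = line[len("Class"):].strip().rstrip(":").strip()
            classes[current] = []
            continue
        if current is None:
            raise ValueError(f"fragment outside of a class: {line!r}")
        wav, onset, offset = line.split()
        classes[current].append((wav, float(onset), float(offset)))
    return classes


example = {
    1: [("wav1", 0.1, 0.5), ("wav2", 1.0, 1.25)],
    2: [("wav1", 2.0, 2.5)],
}
text = format_classes(example)
```

Writing the blank line inside the loop, rather than as a separator between classes, is what guarantees that the last class is terminated too.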

If you want to use other input formats, you need to edit the read_clusters method in tde/readers/disc_reader.py.

Alignments

The package uses gold phone and word alignments to evaluate the inputs. The alignments are stored in tde/share.

The alignment files have the following format, with no header line:

filename1 on1 off1 symbol1
filename2 on2 off2 symbol2
...

Here, filename is the name of the wav file, on and off are the onset and offset times, and symbol is the word or phone label.
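To check that a custom alignment file follows this layout before adding it, a small reader along these lines may help. This is only a sketch: parse_alignment is a hypothetical helper rather than the package's own reader, and it assumes whitespace-separated fields and symbols that contain no spaces.

```python
from collections import defaultdict


def parse_alignment(lines):
    """Group (onset, offset, symbol) intervals by wav filename.

    Hypothetical helper, not the package's own reader.
    """
    alignment = defaultdict(list)
    for lineno, line in enumerate(lines, 1):
        fields = line.split()
        if not fields:
            continue  # tolerate empty lines
        if len(fields) != 4:
            raise ValueError(
                f"line {lineno}: expected 4 fields, got {len(fields)}"
            )
        filename, onset, offset, symbol = fields
        alignment[filename].append((float(onset), float(offset), symbol))
    return dict(alignment)


sample = [
    "wav1 0.00 0.12 h",
    "wav1 0.12 0.30 ai",
    "wav2 0.05 0.40 hello",
]
parsed = parse_alignment(sample)
```

In practice the lines would come from an open file, for example `parse_alignment(open(path))` with a path to a .phn or .wrd file.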

To add your own language to the package, add yourlang.phn and yourlang.wrd to tde/share and register the new option in tde/eval.py (line 39).