The package takes as input the same format as in the Zero Resource Speech Challenge (zerospeech.com):
Class 1: wav1 on1 off1 wav2 on2 off2 Class 2: wav1 on3 off3 wav3 on4 off4 wav2 on5 off5
Note that each class should end with an empty line, including the last class of the file.
If you want to use other input formats, you need to edit the read_clusters method in tde/readers/disc_reader.py.
The package uses gold phone and words alignments to evaluate the inputs. The alignments are stored in tde/share.
The formats for the alignements is (without header):
filename1 on1 off1 symbol1 filename2 on2 off2 symbol2 ...
Where filename are the names of the wavs, and symbol are the words or phones.
To add your own language in the package, you need to add yourlang.phn and yourlang.wrd in tde/share and add the option in tde/eval.py (line 39)