Grouping Measure

class tde.measures.grouping.Grouping(disc, output_folder=None, njobs=1)
Bases: tde.measures.measures.Measure
Grouping measure
Grouping measures how pure the found clusters are, and is close to the ‘purity’ measure used in clustering. See https://docs.syntheticlearner.net/tde/measures/index.html for a summary of all measures.
Input
:param disc: Discovered object, contains the discovered elements
:param output_folder: string, path to the output folder
:param njobs: number of CPUs to be used
Output
:param precision: grouping precision
:param recall: grouping recall

property precision

property recall

get_gold_pairs()
Get all the gold pairs that can be created using the discovered intervals. The pairs are ordered by filename and onset.
Input
:param intervals: a list of all the discovered intervals, with their transcription
Output
:param gold_pairs: a set of all the gold pairs created from the discovered intervals
Parameters
gold_types – all the types (ngrams) that occur in gold_pairs
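
As a rough illustration of pair construction, the sketch below builds gold pairs from a list of transcribed intervals. It assumes that two discovered intervals form a gold pair when they share the same transcription (ngram); the source does not state this rule explicitly. The function name `gold_pairs_sketch` and the simplified `(filename, onset, offset, ngram)` layout are hypothetical and are not part of the tde API.

```python
from collections import defaultdict
from itertools import combinations


def gold_pairs_sketch(intervals):
    """Pair up intervals that share a transcription (assumed gold criterion).

    intervals: iterable of (filename, onset, offset, ngram) tuples.
    This layout is a simplification chosen for the example.
    """
    by_type = defaultdict(list)
    for interval in intervals:
        by_type[interval[3]].append(interval)

    pairs = set()
    for members in by_type.values():
        # Order by filename, then onset, so each pair has one canonical form.
        ordered = sorted(members, key=lambda i: (i[0], i[1]))
        pairs.update(combinations(ordered, 2))
    return pairs


intervals = [
    ("f1", 0.0, 0.3, ("k", "ae", "t")),
    ("f2", 1.0, 1.3, ("k", "ae", "t")),
    ("f1", 2.0, 2.2, ("d", "o")),
]
gold_pairs = gold_pairs_sketch(intervals)
gold_types = {pair[0][3] for pair in gold_pairs}
```

Only the two intervals transcribed as `("k", "ae", "t")` are paired here; the single `("d", "o")` interval has no partner and contributes no pair or type.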

get_found_pairs()
Get all the pairs that were found. The pairs are ordered by filename and onset.
Input
:param clusters: a dict of all the clusters found. The keys are the cluster names, the values are lists of the intervals in each cluster
Output
:param found_pairs: a set of all the discovered pairs
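
The following sketch shows one way the found pairs could be enumerated from a cluster dict of the shape described above: every unordered pair of intervals inside the same cluster. The function name `found_pairs_sketch` and the `(filename, onset, offset)` interval layout are assumptions made for the example, not the library's implementation.

```python
from itertools import combinations


def found_pairs_sketch(clusters):
    """Enumerate all within-cluster interval pairs.

    clusters: dict mapping a cluster name to a list of
    (filename, onset, offset) tuples (hypothetical layout).
    """
    pairs = set()
    for members in clusters.values():
        # Order by filename, then onset, so (a, b) and (b, a) never both appear.
        ordered = sorted(members, key=lambda i: (i[0], i[1]))
        pairs.update(combinations(ordered, 2))
    return pairs


clusters = {
    "c1": [("f2", 1.0, 1.3), ("f1", 0.0, 0.3), ("f1", 2.0, 2.2)],
    "c2": [("f3", 0.5, 0.9)],
}
found_pairs = found_pairs_sketch(clusters)
```

A cluster with n intervals yields n(n-1)/2 pairs, so `c1` contributes three pairs and the singleton `c2` contributes none.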

static get_weights(pairs)
For each type, get its weight.
Input
:param pairs: a set containing pairs of intervals, stored as (filename, onset, offset, token_ngram, ngram), where token_ngram is the ngram with the timestamps of each of its phones, and ngram is just a tuple of all the phones
Output
:return: weights, a dict that gives, for each type (i.e. ngram), its weight, computed as number_of_tokens(ngram) / total_number_of_seen_tokens
:return: counter, a dict that gives, for each type (i.e. ngram), the number of tokens of this ngram in the pairs
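
The weight formula above can be sketched as follows. This is a minimal illustration under two assumptions: each distinct interval counts as one token even if it appears in several pairs, and `token_ngram` is left as a `None` placeholder because its exact structure is not shown in the source. The name `weights_sketch` is hypothetical.

```python
from collections import Counter


def weights_sketch(pairs):
    """Return (weights, counter) for a set of interval pairs.

    Each interval is (filename, onset, offset, token_ngram, ngram).
    """
    # Assumption: an interval seen in several pairs is counted once.
    tokens = {interval for pair in pairs for interval in pair}
    counter = Counter(interval[4] for interval in tokens)
    total = sum(counter.values())
    # number_of_tokens(ngram) / total_number_of_seen_tokens
    weights = {ngram: n / total for ngram, n in counter.items()}
    return weights, dict(counter)


cat, do = ("k", "ae", "t"), ("d", "o")
a1 = ("f1", 0.0, 0.3, None, cat)  # None stands in for token_ngram
a2 = ("f2", 1.0, 1.3, None, cat)
a3 = ("f3", 4.0, 4.3, None, cat)
b1 = ("f1", 2.0, 2.2, None, do)
b2 = ("f2", 3.0, 3.2, None, do)

pairs = {(a1, a2), (a1, a3), (a2, a3), (b1, b2)}
weights, counter = weights_sketch(pairs)
```

With three tokens of one type and two of the other, the weights come out as 3/5 and 2/5, and they always sum to 1 when at least one pair is given.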

compute_grouping()
Compute the grouping by essentially counting the number of tokens of each type in three sets: the set of gold pairs, the set of found pairs, and the intersection of the gold pairs and the found pairs.
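
To give an intuition for how the three sets relate to precision and recall, here is a deliberately simplified, unweighted sketch at the pair level. It is not the library's computation: the documented method counts tokens per type, whereas this version only compares set sizes. The function name `pair_scores_sketch` is hypothetical, and combining the two scores as a harmonic mean is an assumption about what an F-score would look like here.

```python
def pair_scores_sketch(gold_pairs, found_pairs):
    """Unweighted pair-level precision, recall and F-score (illustration only)."""
    hits = gold_pairs & found_pairs  # pairs that are both found and gold

    precision = len(hits) / len(found_pairs) if found_pairs else 0.0
    recall = len(hits) / len(gold_pairs) if gold_pairs else 0.0

    if precision + recall == 0:
        fscore = 0.0
    else:
        # Harmonic mean of precision and recall (assumed definition).
        fscore = 2 * precision * recall / (precision + recall)
    return precision, recall, fscore


# Intervals are reduced to short labels to keep the example readable.
gold = {("a", "b"), ("a", "c"), ("b", "c")}
found = {("a", "b"), ("a", "d")}
precision, recall, fscore = pair_scores_sketch(gold, found)
```

In this toy case one of the two found pairs is gold, and one of the three gold pairs was found. A per-type, token-weighted variant would apply the same idea within each type and combine the per-type ratios using weights such as those described for get_weights.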

property fscore

write_score()

property