Grouping and Type Measures
Clustering quality is evaluated using two metrics. The first metrics
(Grouping precision, recall and F-score) computes the intrinsic
quality of the clusters in terms of their phonetic composition. This
score is equivalent to the purity and inverse purity scores used for
evaluating clustering. As the Matching score, it is computed over
pairs, but contrary to the Matching scores, it focusses on the covered
part of the corpus.
\[\begin{split}\textrm{Grouping precision} &= \sum_{t\in\textrm{types}(\textrm{flat}(P_{clus}))}
freq(t, P_{clus})
\frac{|\textrm{match}(t, P_{clus} \cap P_{goldclus})|}{|\textrm{match}(t, P_{clus})|} \\
\textrm{Grouping recall} &= \sum_{t\in\textrm{types}(\textrm{flat}(P_{goldclus}))}
freq(t, P_{goldclus})
\frac{|\textrm{match}(t, P_{clus} \cap P_{goldclus})|}{|\textrm{match}(t, P_{goldclus})|}\end{split}\]
where
\[\begin{split}P_{clus} &= \{\langle \langle i, j\rangle , \langle k, l \rangle\rangle
| &\exists c\in C_{disc},\langle i, j\rangle\in c \wedge \langle k, l\rangle\in c\} \\
P_{goldclus} &= \{\langle \langle i, j\rangle , \langle k, l \rangle\rangle
| &\exists c_1,c_2\in C_{disc}:\langle i, j\rangle\in c_1 \wedge \langle k, l\rangle\in c_2 \\
&& \wedge T_{i,j}=T_{k,l} \wedge [i,j] \cap [k,l] = \varnothing \}\end{split}\]
The second metrics (Type precision, recall and F-score) takes as the
gold cluster set the true lexicon and is therefore much more
demanding. Indeed, a system could have very pure clusters, but could
systematically missegment words. Since a discovered cluster could have
several transcriptions, we use all of them (rather than using some
kind of centroid).
\[\begin{split}\textrm{Type precision} &= \frac{|\textrm{types}(F_{disc}) \cap \textrm{types}(F_{goldLex})|}
{|\textrm{types}(F_{disc})|} \\
\textrm{Type recall} &= \frac{|\textrm{types}(F_{disc}) \cap \textrm{types}(F_{goldLex})|}
{|\textrm{types}(F_{goldLex})|} \\\end{split}\]
where
\(F_{disc}\): the set of discovered fragments,
\(F_{disc} = \{ f | f \in c , c \in C_{disc} \}\)
\(F_{goldLex}\): the set of fragments corresponding to the
corpus transcribed at the word level (gold transcription).