NED and Coverage Measure

Many spoken term discovery systems incorporate a step in which fragments of speech are realigned and compared. Matching quality measures the accuracy of this process. Here, we use the NED and Coverage metrics to evaluate it.

NED and Coverage are quick to compute and give a qualitative estimate of the matching step. NED is the Normalised Edit Distance; it is equal to 0 when the two fragments of a pair have exactly the same transcription, and 1 when they differ in all phonemes. Coverage is the fraction of the corpus containing matching pairs that has been discovered.
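The two quantities can be illustrated with a minimal Python sketch. It is not the toolkit's implementation: it assumes that the corpus transcription is a single phoneme list `T`, that a fragment is an inclusive pair of phoneme indices, and that the gold set of matching pairs is supplied by hand. All names (`levenshtein`, `ned`, `cover`, `ned_and_coverage`, `p_disc`, `p_all`) are hypothetical and chosen for this example only.

```python
from typing import List, Sequence, Set, Tuple

Fragment = Tuple[int, int]        # inclusive (start, end) phoneme indices (assumption)
Pair = Tuple[Fragment, Fragment]


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Edit distance between two phoneme sequences, with unit costs."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (x != y)))   # substitution or match
        prev = curr
    return prev[-1]


def ned(x: Fragment, y: Fragment, T: Sequence[str]) -> float:
    """Edit distance normalised by the length of the longer fragment."""
    (i, j), (k, l) = x, y
    # A fragment (i, j) spans j - i + 1 phonemes; likewise (k, l) spans l - k + 1.
    return levenshtein(T[i:j + 1], T[k:l + 1]) / max(j - i + 1, l - k + 1)


def cover(pairs: List[Pair]) -> Set[int]:
    """Set of corpus positions covered by any fragment appearing in a pair."""
    covered: Set[int] = set()
    for pair in pairs:
        for i, j in pair:
            covered.update(range(i, j + 1))
    return covered


def ned_and_coverage(p_disc: List[Pair], p_all: List[Pair],
                     T: Sequence[str]) -> Tuple[float, float]:
    """Mean NED over discovered pairs, and covered fraction of the gold cover."""
    mean_ned = sum(ned(x, y, T) for x, y in p_disc) / len(p_disc)
    coverage = len(cover(p_disc)) / len(cover(p_all))
    return mean_ned, coverage


# Toy corpus: one character per phoneme, positions 0..9.
T = list("abcdabcxab")
# Hand-written gold pairs (hypothetical): "abc"/"abc", "ab"/"ab", "ab"/"ab".
p_all = [((0, 2), (4, 6)), ((0, 1), (8, 9)), ((4, 5), (8, 9))]
# Discovered pairs: one exact match and one pair differing in every phoneme.
p_disc = [((0, 2), (4, 6)), ((0, 1), (6, 7))]

print(ned_and_coverage(p_disc, p_all, T))  # (0.5, 0.875)
```

In this toy case the first discovered pair has a NED of 0 and the second a NED of 1, so the mean is 0.5; the discovered fragments cover 7 positions against 8 for the gold pairs, giving a coverage of 0.875.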

\[\begin{split}\textrm{NED} &= \sum_{\langle x, y\rangle \in P_{disc}} \frac{\textrm{ned}(x, y)}{|P_{disc}|} \\ \textrm{Coverage} &= \frac{|\textrm{cover}(P_{disc})|}{|\textrm{cover}(P_{all})|}\end{split}\]


\[\begin{split}\textrm{ned}(\langle i, j \rangle, \langle k, l \rangle) &= \frac{\textrm{Levenshtein}(T_{i,j}, T_{k,l})}{\textrm{max}(j-i+1,l-k+1)} \\ \textrm{cover}(P) &= \bigcup_{\langle i, j \rangle \in \textrm{flat}(P)}[i, j] \\ \textrm{flat}(P) &= \{p|\exists q:\{p,q\}\in P\}\end{split}\]

with

  • \(P_{all}\): the set of all possible non-overlapping matching fragment pairs, \(P_{all}=\{ \{a,b \}\in F_{all} \times F_{all} | T_{a} = T_{b}, \neg \textrm{overlap}(a,b)\}\).

  • \(P_{disc}\): the set of non-overlapping discovered pairs, \(P_{disc} = \{ \{a,b\} | a \in c, b \in c, \neg \textrm{overlap}(a,b), c \in C_{disc} \}\)

  • \(P_{disc^*}\): the set of pairwise substring completions of \(P_{disc}\), which means that we compute all of the possible minimal path realignments of the two strings, and extract all of the substring pairs along the path (e.g., for fragment pair