wordseg.syllabification in Python, wordseg-syll in bash.
Estimates syllable boundaries on a text using the maximal onset principle.
This algorithm fully syllabifies a text from a list of onsets and a list of vowels. The input text must be either in orthographic form (with word separators only) or in phonemized form (with both word and phone separators). In the output text, syllable separators are added at the estimated syllable boundaries. For examples of vowels and onsets files, see the directory wordseg/data/syllabification.
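The maximal onset principle can be illustrated on a single word. The following is a minimal sketch, not wordseg's implementation: the function name `syllabify_word`, the representation of onsets as tuples of phones, and the toy onset and vowel inventories are all assumptions made for the example. Between two vowels, the longest consonant cluster that is a valid onset is attached to the following vowel, and the remaining consonants close the previous syllable.

```python
def syllabify_word(phones, onsets, vowels):
    """Split a list of phones into syllables (illustrative sketch only).

    phones: list of phone strings for ONE word
    onsets: set of tuples of phones, each tuple a valid onset (assumed format)
    vowels: set of phone strings
    """
    vowel_positions = [i for i, p in enumerate(phones) if p in vowels]
    if not vowel_positions:
        raise ValueError(f"word has no vowel: {phones!r}")

    # The first syllable starts at the beginning of the word.
    starts = [0]
    for prev, cur in zip(vowel_positions, vowel_positions[1:]):
        start = cur  # default: the syllable has an empty onset
        # Try ever longer clusters to the left of the vowel and keep the
        # longest one that is a valid onset.
        for cand in range(cur - 1, prev, -1):
            if tuple(phones[cand:cur]) in onsets:
                start = cand
        starts.append(start)

    ends = starts[1:] + [len(phones)]
    return [phones[s:e] for s, e in zip(starts, ends)]


# Toy inventories, chosen for the example only.
VOWELS = {"e", "a"}
ONSETS = {("s", "t", "r"), ("t", "r"), ("r",), ("s",), ("t",), ("k",)}

syllables = syllabify_word("e k s t r a".split(), ONSETS, VOWELS)
print(".".join("".join(s) for s in syllables))  # ek.stra
```

Here the cluster `k s t r` is not a valid onset but `s t r` is, so the boundary falls after `k`.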
Syllabifier(onsets, vowels, separator=<wordseg.separator.Separator object>, filling_vowel=False, log=<RootLogger root (WARNING)>)
Syllabify a text given in phonological or orthographic form
Syllabification errors can occur when the onsets and/or vowels are not adapted to the input text (see the tolerant parameter).
onsets (list) – The list of valid onsets in the text
vowels (list) – The list of vowels in the text
separator (Separator, optional) – Token separation in the text
filling_vowel (bool, optional) – When True, append a silent vowel to the end of words that contain no vowel (the vowel is removed after processing, so the text is unchanged). When False, those words cannot be syllabified.
log (logging.Logger, optional) – Where to send log messages
ValueError – If onsets or vowels are empty or are not lists.
Read a vowel or onsets file as a list
syllabify(text, strip=False, tolerant=False)
Returns the text with syllable boundaries added
text (sequence) – The input text to be syllabified. Each element of the sequence is assumed to be a single and complete utterance in valid phonological form.
strip (bool, optional) – When True, removes the syllable boundary at the end of words.
tolerant (bool, optional) – When False (the default), the function raises a ValueError on the first utterance that has not been correctly syllabified. When True, failed utterances are omitted from the output and a log warning is issued instead.
The text with estimated syllable boundaries added. If tolerant is True, some utterances may be missing from the output.
ValueError – If an utterance has not been correctly syllabified (when tolerant is False), if separator.syllable is found in the text, or if onsets or vowels are empty.
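The difference between strict and tolerant processing described above can be sketched as follows. This is a hypothetical illustration and does not use wordseg itself: `syllabify_utterances` and `toy_syllabify` are names invented for the example, and the toy function only stands in for a real per-utterance syllabifier.

```python
import logging


def syllabify_utterances(utterances, syllabify_one, tolerant=False,
                         log=logging.getLogger()):
    """Apply syllabify_one to each utterance (hypothetical helper).

    In strict mode the first failure propagates as a ValueError. In tolerant
    mode failed utterances are dropped from the output and a warning is logged.
    """
    output = []
    for index, utterance in enumerate(utterances, start=1):
        try:
            output.append(syllabify_one(utterance))
        except ValueError as err:
            if not tolerant:
                raise
            log.warning("skipping utterance %d: %s", index, err)
    return output


def toy_syllabify(utterance):
    # Stand-in for a real syllabifier: fails when the utterance has no vowel.
    if not any(char in "aeiou" for char in utterance):
        raise ValueError(f"no vowel in {utterance!r}")
    return utterance + "|"


texts = ["ba", "pst", "do"]
kept = syllabify_utterances(texts, toy_syllabify, tolerant=True)
print(kept)  # ['ba|', 'do|']
```

In tolerant mode the output can be shorter than the input, so any code that aligns input and output utterances by position should account for the dropped ones.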