SplitWordsFunction
- class montreal_forced_aligner.dictionary.mixins.SplitWordsFunction(clitic_marker, initial_clitic_regex, final_clitic_regex, compound_regex, non_speech_regexes, oov_word=None, word_mapping=None, grapheme_mapping=None)
Bases: object
Class for functions that split words that have compound and clitic markers
- Parameters:
compound_markers (list[str]) – Characters that mark compound words
brackets (list[tuple[str, str]], optional) – Character tuples to treat as full brackets around words
word_mapping (dict[str, int]) – Mapping of words to integer IDs
oov_word (str) – What to label words not in the dictionary, defaults to None
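
To illustrate the kind of splitting described above, here is a minimal standalone sketch. It does not call or reproduce the SplitWordsFunction implementation: the helper name split_word, the example regular expressions, and the sample word_mapping are all assumptions made for illustration. The idea shown is that a word already in the mapping is kept whole, and otherwise it is split on compound markers and then on initial and final clitics.

```python
import re

# Hypothetical patterns and vocabulary, chosen only for this example.
compound_regex = re.compile(r"-")
initial_clitic_regex = re.compile(r"^(?:c'|l')")
final_clitic_regex = re.compile(r"(?:'s|n't)$")
word_mapping = {"c'": 1, "est": 2, "vingt": 3, "quatre": 4, "l'": 5}


def split_word(word, word_mapping, compound_regex,
               initial_clitic_regex, final_clitic_regex):
    """Illustrative helper (not MFA's code): split a word into sub-words."""
    # A word that is already in the dictionary is never split.
    if word in word_mapping:
        return [word]
    pieces = []
    # First split on compound markers, dropping empty fragments.
    for part in compound_regex.split(word):
        if not part:
            continue
        if part in word_mapping:
            pieces.append(part)
            continue
        # Peel off an initial clitic if something remains after it.
        m = initial_clitic_regex.match(part)
        if m and m.end() < len(part):
            pieces.append(m.group(0))
            part = part[m.end():]
        # Peel off a final clitic if something remains before it.
        m = final_clitic_regex.search(part)
        if m and m.start() > 0:
            pieces.extend([part[:m.start()], m.group(0)])
        else:
            pieces.append(part)
    return pieces


args = (word_mapping, compound_regex, initial_clitic_regex, final_clitic_regex)
print(split_word("c'est", *args))
print(split_word("vingt-quatre", *args))
print(split_word("dog's", *args))
```

In this sketch, pieces that are still missing from word_mapping after splitting are returned unchanged; a caller could then replace them with oov_word, which is how the oov_word parameter above is described.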