DictionaryMixin
- class montreal_forced_aligner.dictionary.mixins.DictionaryMixin(oov_word='<unk>', silence_word='<eps>', optional_silence_phone='sil', oov_phone='spn', other_noise_phone=None, position_dependent_phones=False, num_silence_states=5, num_non_silence_states=3, shared_silence_phones=False, ignore_case=True, silence_probability=0.5, initial_silence_probability=0.5, final_silence_correction=None, final_non_silence_correction=None, punctuation=None, clitic_markers=None, compound_markers=None, quote_markers=None, word_break_markers=None, brackets=None, non_silence_phones=None, disambiguation_symbols=None, clitic_set=None, max_disambiguation_symbol=0, phone_set_type='UNKNOWN', preserve_suprasegmentals=False, base_phone_mapping=None, use_cutoff_model=False, cutoff_word='<cutoff>', **kwargs)
Bases: object
Abstract mixin class for MFA classes that work with pronunciation dictionaries
- Parameters:
oov_word (str) – What to label words not in the dictionary, defaults to '<unk>'
position_dependent_phones (bool) – Specifies whether phones should be represented as dependent on their position in the word (beginning, middle or end), defaults to False
num_silence_states (int) – Number of states to use for silence phones, defaults to 5
num_non_silence_states (int) – Number of states to use for non-silence phones, defaults to 3
shared_silence_phones (bool) – Specify whether to share states across all silence phones, defaults to False
ignore_case (bool) – Flag for whether all items should be converted to lower case, defaults to True
silence_probability (float) – Probability of optional silences following words, defaults to 0.5
initial_silence_probability (float) – Probability of initial silence, defaults to 0.5
final_silence_correction (float) – Correction term on final silence, defaults to None
final_non_silence_correction (float) – Correction term on final non-silence, defaults to None
punctuation (str, optional) – Punctuation to use when parsing text
clitic_markers (str, optional) – Clitic markers to use when parsing text
compound_markers (str, optional) – Compound markers to use when parsing text
quote_markers (list[str], optional) – Quotation markers to use when parsing text
word_break_markers (list[str], optional) – Word break markers to use when parsing text
brackets (list[tuple[str, str]], optional) – Character tuples to treat as full brackets around words
disambiguation_symbols (set[str]) – Set of disambiguation symbols
max_disambiguation_symbol (int) – Maximum number of disambiguation symbols required, defaults to 0
preserve_suprasegmentals (bool) – Flag for whether to keep phones separated by tone and stress, defaults to False
base_phone_mapping (dict[str, str]) – Mapping between phone symbols to make them share a base root for decision trees
- property base_phones
Phones grouped by base phone
- property dictionary_options
Dictionary options
- property extra_questions_mapping
Mapping of extra questions for the given phone set type
- get_base_phone(phone)
Get the base phone by stripping diacritics, tone, and/or stress
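As a rough illustration of what "base phone" means here, the sketch below strips a trailing ARPABET-style stress digit and any Unicode combining diacritics. The helper name `strip_to_base_phone` and the stripping rules are assumptions for illustration only; the actual behavior of `get_base_phone` depends on `phone_set_type` and `base_phone_mapping` and may differ.

```python
import re
import unicodedata


def strip_to_base_phone(phone: str) -> str:
    """Hypothetical helper, not part of MFA: reduce a phone to a base form."""
    # ARPABET-style stress is a trailing digit, e.g. "AA1" -> "AA"
    phone = re.sub(r"\d+$", "", phone)
    # Decompose so that diacritics become separate combining characters
    decomposed = unicodedata.normalize("NFD", phone)
    # Drop the combining characters, keeping only the base letters
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Under these assumptions, `strip_to_base_phone("AA1")` gives `"AA"`, so stressed and unstressed variants of a vowel end up in the same group.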
- property kaldi_grouped_phones
Non-silence phones in Kaldi format
- property kaldi_non_silence_phones
Non-silence phones in Kaldi format
- property kaldi_silence_phones
Silence phones in Kaldi format
- property phone_mapping
Mapping of phones to integer IDs
- property phones
The set of all phones (silence and non-silence)
- property positional_non_silence_phones
List of non-silence phones with positions
- property positional_silence_phones
List of silence phones with positions
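A minimal sketch of what "phones with positions" can look like, assuming the Kaldi convention of word-position suffixes (`_B` begin, `_E` end, `_I` internal, `_S` singleton). The helper `with_positions` is hypothetical, and the exact symbols and ordering produced by the two properties above are not confirmed here.

```python
# Kaldi-style word-position suffixes (assumed, see the note above)
POSITIONS = ("_B", "_E", "_I", "_S")


def with_positions(phones):
    """Hypothetical helper: expand each phone into its positional variants."""
    # Sort first so the output order is deterministic
    return [phone + suffix for phone in sorted(phones) for suffix in POSITIONS]


positional = with_positions({"AA", "K"})
```

With `position_dependent_phones=True`, each phone is modeled separately per position, which multiplies the size of the phone inventory by the number of positions.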
- property reversed_phone_mapping
A mapping of integer IDs to phones
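The relationship between a phone-to-ID mapping and its reverse can be sketched as follows. The symbol list and its ordering are made up for illustration; the IDs MFA actually assigns are not shown in this reference.

```python
def build_phone_mapping(symbols):
    """Hypothetical helper: assign consecutive integer IDs in list order."""
    return {symbol: index for index, symbol in enumerate(symbols)}


def reverse_mapping(mapping):
    """Invert a phone-to-ID mapping into an ID-to-phone mapping."""
    return {index: symbol for symbol, index in mapping.items()}


# Illustrative symbol table only, not MFA's real ordering
example_mapping = build_phone_mapping(["<eps>", "sil", "spn", "AA", "K"])
example_reversed = reverse_mapping(example_mapping)
```

Because the IDs are unique, inverting the dictionary loses no entries, so `example_reversed[example_mapping[p]] == p` holds for every phone `p`.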
- property silence_disambiguation_symbol
Silence disambiguation symbol
- property silence_phones
Silence phones
- property silence_symbols
A colon-separated string of silence phone IDs
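A colon-separated ID string of this kind could be assembled as in the sketch below. The helper `silence_id_string` and the example mapping are assumptions for illustration; the ordering of IDs in the real property is not specified here.

```python
def silence_id_string(silence_phones, phone_mapping):
    """Hypothetical helper: join the integer IDs of silence phones with colons."""
    # Sort the IDs numerically so the result is deterministic
    ids = sorted(phone_mapping[phone] for phone in silence_phones)
    return ":".join(str(i) for i in ids)


# Illustrative mapping only, not MFA's real IDs
example_ids = silence_id_string({"sil", "spn"}, {"<eps>": 0, "sil": 1, "spn": 2, "AA": 3})
```

Several Kaldi command-line tools accept silence phone lists as colon-separated integers, which is presumably why the property exposes this format.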
- property specials_set
Special words, like the oov_word, silence_word, <s>, and </s>