PyniniTrainerMixin
- class montreal_forced_aligner.g2p.trainer.PyniniTrainerMixin(order=8, random_starts=25, delta=0.0009765625, alpha=1.0, batch_size=800, num_iterations=10, smoothing_method='kneser_ney', pruning_method='relative_entropy', model_size=1000000, prune_threshold=1e-07, insertions=True, deletions=True, fst_default_cache_gc='', fst_default_cache_gc_limit='', **kwargs)
Bases: object
Mixin for training Pynini G2P models
- Parameters:
order (int) – Order of the ngram model, defaults to 8
random_starts (int) – Number of random starts to use in initialization, defaults to 25
seed (int) – Seed for randomization, defaults to 1917
delta (float) – Comparison/quantization delta for Baum-Welch training, defaults to 1/1024
alpha (float) – Step size reduction power parameter for Baum-Welch training; full standard batch EM is run (not stepwise) if set to 0, defaults to 1.0
batch_size (int) – Batch size for Baum-Welch training, defaults to 800
num_iterations (int) – Maximum number of iterations to use in Baum-Welch training, defaults to 10
smoothing_method (str) – Smoothing method for the ngram model, defaults to “kneser_ney”
pruning_method (str) – Pruning method for pruning the ngram model, defaults to “relative_entropy”
model_size (int) – Target number of ngrams for pruning, defaults to 1000000
prune_threshold (float) – Pruning threshold for the ngram model, defaults to 1e-07
insertions (bool) – Flag for whether to allow insertions, defaults to True
deletions (bool) – Flag for whether to allow deletions, defaults to True
fst_default_cache_gc (str) – String to pass to OpenFst binaries for GC behavior
fst_default_cache_gc_limit (str) – String to pass to OpenFst binaries for the GC limit
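The defaults listed above can be collected in one place when preparing a training configuration. The following is a minimal sketch only: `PyniniTrainingOptions` and `override` are hypothetical names introduced for illustration and are not part of Montreal Forced Aligner. The field names and default values are copied from the class signature; note that `delta` is written as `1 / 1024`, which is exactly `0.0009765625`.

```python
from dataclasses import asdict, dataclass


@dataclass
class PyniniTrainingOptions:
    """Hypothetical container mirroring the defaults in the signature above."""

    order: int = 8
    random_starts: int = 25
    delta: float = 1 / 1024  # same value as 0.0009765625
    alpha: float = 1.0
    batch_size: int = 800
    num_iterations: int = 10
    smoothing_method: str = "kneser_ney"
    pruning_method: str = "relative_entropy"
    model_size: int = 1_000_000
    prune_threshold: float = 1e-7
    insertions: bool = True
    deletions: bool = True
    fst_default_cache_gc: str = ""
    fst_default_cache_gc_limit: str = ""


def override(**changes):
    """Return the defaults as a dict, with the given fields replaced."""
    return asdict(PyniniTrainingOptions(**changes))


# Example: a smaller ngram order and fewer Baum-Welch iterations.
options = override(order=5, num_iterations=5)
```

Under the assumption that a concrete trainer class including this mixin accepts these names as keyword arguments, a dictionary like `options` could be unpacked into its constructor. Misspelled field names raise a `TypeError` in the sketch instead of being silently absorbed by `**kwargs`.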
- property afst_path
Path to store aligned FSTs
- property align_path
Path to store alignment models
- property architecture
Pynini
- property cg_path
Path to covering grammar FST
- property data_source_identifier
Dictionary name
- property encoder_path
Internal temporary encoder file
- property far_path
Internal temporary FAR file
- property fst_path
Internal temporary FST file
- property input_far_path
Path to store grapheme archive
- property input_path
Path to temporary file to store grapheme training data
- property output_far_path
Path to store phone archive
- property output_path
Path to temporary file to store phone training data
- property sym_path
Internal temporary symbol file