PhonetisaurusTrainerMixin#

class montreal_forced_aligner.g2p.phonetisaurus_trainer.PhonetisaurusTrainerMixin(order=8, batch_size=1000, num_iterations=10, smoothing_method='kneser_ney', pruning_method='relative_entropy', model_size=1000000, initial_prune_threshold=0.0001, insertions=True, deletions=True, restrict_m2m=False, penalize_em=False, penalize=False, sequence_separator='|', skip='_', alignment_separator=';', grapheme_order=2, phone_order=2, em_threshold=1e-05, **kwargs)[source]#

Bases: object

Mixin class for training Phonetisaurus-style models

Parameters:
  • order (int) – Order of the ngram model, defaults to 8

  • batch_size (int) – Batch size for training, defaults to 1000

  • num_iterations (int) – Maximum number of iterations to use in Baum-Welch training, defaults to 10

  • smoothing_method (str) – Smoothing method for the ngram model, defaults to “kneser_ney”

  • pruning_method (str) – Pruning method for pruning the ngram model, defaults to “relative_entropy”

  • model_size (int) – Target number of ngrams for pruning, defaults to 1000000

  • initial_prune_threshold (float) – Initial pruning threshold used to determine which multiple-phone or multiple-grapheme strings are allowed, defaults to 0.0001

  • insertions (bool) – Flag for whether to allow for insertions, default True

  • deletions (bool) – Flag for whether to allow for deletions, default True

  • restrict_m2m (bool) – Flag for whether to restrict possible alignments to one-to-many and disable many-to-many alignments, default False

  • penalize_em (bool) – Flag for whether many-to-many and one-to-many mappings are penalized over one-to-one mappings during training, default False

  • penalize (bool) – Flag for whether many-to-many and one-to-many mappings are penalized over one-to-one mappings during export, default False

  • sequence_separator (str) – Character to use for concatenating and aligning multiple phones or graphemes, defaults to “|”

  • skip (str) – Character to use to represent deletions or insertions, defaults to “_”

  • alignment_separator (str) – Character to use for concatenating grapheme strings and phone strings, defaults to “;”

  • grapheme_order (int) – Maximum number of graphemes to map to a single phone

  • phone_order (int) – Maximum number of phones to map to a single grapheme

  • em_threshold (float) – Minimum change in score between iterations; EM training stops early once the change falls below this threshold
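
As an illustration of how these keyword arguments might be supplied, the sketch below instantiates a minimal placeholder subclass of the mixin. The subclass name MyPhonetisaurusTrainer is hypothetical, and a real trainer would also need corpus and dictionary setup that is omitted here.

    from montreal_forced_aligner.g2p.phonetisaurus_trainer import PhonetisaurusTrainerMixin

    class MyPhonetisaurusTrainer(PhonetisaurusTrainerMixin):
        """Hypothetical concrete trainer used only to show parameter passing."""
        pass

    trainer = MyPhonetisaurusTrainer(
        order=8,             # order of the ngram model
        num_iterations=10,   # maximum Baum-Welch iterations
        grapheme_order=2,    # at most 2 graphemes per alignment chunk
        phone_order=2,       # at most 2 phones per alignment chunk
        em_threshold=1e-05,  # early-stopping threshold for EM training
        insertions=True,     # allow insertions
        deletions=True,      # allow deletions
    )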

alignment_init_function#

alias of AlignmentInitWorker

property alignment_model_path#

Path to store alignment model FST

property alignment_symbols_path#

Path to alignment symbol table

property architecture#

Phonetisaurus

property data_directory#

Data directory for trainer

property data_source_identifier#

Dictionary name

expectation()[source]#

Run the expectation step for training

export_alignments()[source]#

Combine alignment training archives into a final FST archive used to train the ngram model

export_model(output_model_path)[source]#

Export G2P model to specified path

Parameters:

output_model_path (Path) – Path to export the model to
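
A minimal usage sketch, assuming a fully trained trainer instance; the output file name and .zip extension are illustrative, not prescribed by this method.

    from pathlib import Path

    # `trainer` is assumed to have completed train() already.
    trainer.export_model(Path("my_g2p_model.zip"))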

property far_path#

Path to store final aligned FSTs

property fst_path#

Path to store final trained model

property grapheme_symbols_path#

Path to final model’s grapheme symbol table

initialize_alignments()[source]#

Initialize alignment FSTs for training

maximization(last_iteration=False)[source]#

Run the maximization step for training

Returns:

Current iteration’s score

Return type:

float

property ngram_path#

Path to store ngram model

property phone_symbols_path#

Path to final model’s phone symbol table

train()[source]#

Train a G2P model
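
Conceptually, training chains the alignment and ngram steps documented on this page. The sketch below shows one plausible ordering of those documented methods; the actual train() implementation may wrap or sequence them differently.

    # Illustrative outline of the training pipeline; see the source for the
    # authoritative sequence.
    trainer.initialize_alignments()  # build the initial alignment FSTs
    trainer.train_alignments()       # refine the alignments with EM
    trainer.export_alignments()      # combine archives into a final FST archive
    trainer.train_ngram_model()      # fit the ngram model on the aligned FSTs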

train_alignments()[source]#

Run Expectation-Maximization (EM) training on the alignment FSTs to generate well-aligned FSTs for ngram modeling
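
A rough sketch of such an EM loop, assuming the documented expectation() and maximization() steps alternate until the per-iteration change in score drops below em_threshold; the attribute access and early-stopping logic shown here are assumptions, and the authoritative loop is the one inside train_alignments() itself.

    # Simplified EM loop for illustration only.
    previous_score = None
    for iteration in range(trainer.num_iterations):
        trainer.expectation()
        # Flagging the final scheduled iteration is a guess at how last_iteration is used.
        score = trainer.maximization(last_iteration=iteration == trainer.num_iterations - 1)
        if previous_score is not None and abs(previous_score - score) < trainer.em_threshold:
            break  # change in score fell below em_threshold; stop early
        previous_score = score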

train_iteration()[source]#

Run a single training iteration (not used by this trainer)

train_ngram_model()[source]#

Train an ngram model on the aligned FSTs