LmCorpusTrainerMixin

class montreal_forced_aligner.language_modeling.trainer.LmCorpusTrainerMixin(**kwargs)

Bases: LmTrainerMixin, TextCorpusMixin

Top-level worker to train a language model from a text corpus

Parameters:
  • order (int) – Ngram order, defaults to 3

  • method (str) – Smoothing method for the ngram model, defaults to “kneser_ney”

  • count_threshold (int) – Minimum count a word needs to avoid being treated as an out-of-vocabulary (OOV) item, defaults to 1
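
The effect of count_threshold can be illustrated with a minimal sketch. This is not MFA's implementation: the function name apply_count_threshold and the OOV symbol "<unk>" are hypothetical, chosen only for this example, and the sketch assumes that a word is kept when its count is at least the threshold.

```python
from collections import Counter

# Hypothetical OOV symbol for this sketch; the symbol MFA actually uses may differ.
OOV_SYMBOL = "<unk>"


def apply_count_threshold(sentences, count_threshold=1):
    """Replace words seen fewer than count_threshold times with OOV_SYMBOL."""
    # Count every word across the whole corpus.
    counts = Counter(word for sentence in sentences for word in sentence)
    return [
        [word if counts[word] >= count_threshold else OOV_SYMBOL for word in sentence]
        for sentence in sentences
    ]


corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
# With the default threshold of 1, every observed word is kept.
unchanged = apply_count_threshold(corpus, count_threshold=1)
# With a threshold of 2, words seen only once ("cat", "dog") become OOV.
thresholded = apply_count_threshold(corpus, count_threshold=2)
```

Under this reading, the default of 1 leaves the vocabulary untouched, and raising the threshold trades vocabulary coverage for more reliable counts.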

See also

LmTrainerMixin

For language model training parameters

TextCorpusMixin

For corpus parsing parameters

TopLevelMfaWorker

For top-level parameters

property cnts_path

Internal path to counts file

evaluate()

Run an evaluation over the training data to generate a perplexity score
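
For reference, perplexity is the exponential of the average negative log-probability the model assigns to each token. The sketch below shows that standard definition only; the function name perplexity and its input format are assumptions for illustration and do not reflect how evaluate() is implemented.

```python
import math


def perplexity(token_probabilities):
    """Perplexity of a sequence given the model probability of each token."""
    if not token_probabilities:
        raise ValueError("need at least one token probability")
    # Sum of natural-log probabilities over all tokens.
    log_sum = sum(math.log(p) for p in token_probabilities)
    # Exponentiate the average negative log-probability.
    return math.exp(-log_sum / len(token_probabilities))


# A model that assigns probability 0.25 to each of four tokens has a
# perplexity of 4, up to floating-point rounding.
uniform = perplexity([0.25, 0.25, 0.25, 0.25])
```

Lower values indicate a better fit. A score computed on the training data tends to be optimistic compared with one computed on held-out text.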

property far_path

Internal path to FAR file

property meta

Metadata information for the language model

property sym_path

Internal path to symbols file

train()

Train a language model

train_large_lm()

Train a large language model

property training_path

Internal path to training data