DubmTrainer

class montreal_forced_aligner.ivector.trainer.DubmTrainer(num_iterations=4, num_gselect=30, subsample=5, num_frames=500000, num_gaussians=256, num_iterations_init=20, initial_gaussian_proportion=0.5, min_gaussian_weight=0.0001, remove_low_count_gaussians=True, **kwargs)

Bases: IvectorModelTrainingMixin

Trainer for diagonal universal background models

Parameters:
  • num_iterations (int) – Number of training iterations to perform, defaults to 4

  • num_gselect (int) – Number of Gaussian-selection indices to use while training, defaults to 30

  • subsample (int) – Subsample factor for feature frames, defaults to 5

  • num_frames (int) – Number of frames to keep in memory for initialization, defaults to 500000

  • num_gaussians (int) – Number of Gaussians to use for DUBM training, defaults to 256

  • num_iterations_init (int) – Number of iterations to use when initializing the UBM, defaults to 20

  • initial_gaussian_proportion (float) – Proportion of total Gaussians to use initially, defaults to 0.5

  • min_gaussian_weight (float) – Minimum weight a Gaussian may have before it is removed, defaults to 0.0001

  • remove_low_count_gaussians (bool) – Flag for removing low-count Gaussians in the final round of training, defaults to True

See also

IvectorModelTrainingMixin

For base ivector training parameters
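
The arithmetic behind two of these parameters can be sketched in plain Python. This is an illustration only, not MFA's implementation: the defaults are copied from the signature above, while the two derivations (truncating the product for the initial Gaussian count, as Kaldi's train_diag_ubm.sh does, and keeping every n-th frame when subsampling) are assumptions about how the values are applied.

```python
# Defaults copied from the DubmTrainer signature above.
defaults = {
    "num_iterations": 4,
    "num_gselect": 30,
    "subsample": 5,
    "num_frames": 500000,
    "num_gaussians": 256,
    "num_iterations_init": 20,
    "initial_gaussian_proportion": 0.5,
    "min_gaussian_weight": 0.0001,
    "remove_low_count_gaussians": True,
}

# Assumption: the initial Gaussian count is the truncated product of the
# proportion and the target count, as in Kaldi's train_diag_ubm.sh.
initial_gaussians = int(
    defaults["initial_gaussian_proportion"] * defaults["num_gaussians"]
)

# Assumption: a subsample factor of n keeps every n-th feature frame.
frames = list(range(20))  # stand-in for 20 feature frames
subsampled = frames[:: defaults["subsample"]]

print(initial_gaussians)  # 128
print(subsampled)         # [0, 5, 10, 15]
```

With the defaults, initialization would start from 128 Gaussians and use one frame in five.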

acc_global_stats()

Multiprocessing function that accumulates global GMM stats

See also

AccGlobalStatsFunction

Multiprocessing helper function for each job

DubmTrainer.acc_global_stats_arguments

Job method for generating arguments for the helper function

gmmbin/gmm-global-sum-accs.cc

Relevant Kaldi binary

train_diag_ubm.sh

Reference Kaldi script
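
To make "global GMM stats" concrete, the following is a minimal pure-Python sketch of the idea, assuming the statistics are per-Gaussian occupancies plus first- and second-order sums, and that per-job accumulators are combined by element-wise addition (the role gmm-global-sum-accs plays in Kaldi). The function names `accumulate` and `sum_accs` are hypothetical and are not part of MFA or Kaldi.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def accumulate(frames, weights, means, variances):
    """Return per-Gaussian occupancy, first-order and second-order sums."""
    num_gauss, dim = len(weights), len(means[0])
    occ = [0.0] * num_gauss
    first = [[0.0] * dim for _ in range(num_gauss)]
    second = [[0.0] * dim for _ in range(num_gauss)]
    for x in frames:
        logs = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        top = max(logs)  # subtract the max for numerical stability
        post = [math.exp(value - top) for value in logs]
        norm = sum(post)
        for g in range(num_gauss):
            p = post[g] / norm  # posterior of Gaussian g for this frame
            occ[g] += p
            for d in range(dim):
                first[g][d] += p * x[d]
                second[g][d] += p * x[d] ** 2
    return occ, first, second


def sum_accs(a, b):
    """Element-wise sum of two accumulators, e.g. from two jobs."""
    occ = [x + y for x, y in zip(a[0], b[0])]
    first = [[x + y for x, y in zip(r, s)] for r, s in zip(a[1], b[1])]
    second = [[x + y for x, y in zip(r, s)] for r, s in zip(a[2], b[2])]
    return occ, first, second


# Toy two-component, one-dimensional model and two "jobs" of frames.
weights = [0.5, 0.5]
means = [[0.0], [4.0]]
variances = [[1.0], [1.0]]
job1 = accumulate([[0.1], [3.9]], weights, means, variances)
job2 = accumulate([[-0.2], [4.3]], weights, means, variances)
total_occ, total_first, total_second = sum_accs(job1, job2)
```

Because each frame's posteriors sum to one, the total occupancy equals the number of frames, and summing per-job accumulators gives the same result as accumulating over all frames in a single pass.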

acc_global_stats_arguments()

Generate Job arguments for AccGlobalStatsFunction

Returns:

Arguments for processing

Return type:

list[AccGlobalStatsArguments]

property dubm_options

Options for DUBM training

property exported_model_path

Temporary path for saving the intermediate model

finalize_training()

Finalize DUBM training

gmm_gselect()

Multiprocessing function that stores Gaussian selection indices on disk

See also

GmmGselectFunction

Multiprocessing helper function for each job

DubmTrainer.gmm_gselect_arguments

Job method for generating arguments for the helper function

train_diag_ubm.sh

Reference Kaldi script
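
As a rough illustration of Gaussian selection, the sketch below keeps, for each frame, the indices of the `num_gselect` highest-scoring Gaussians under a diagonal GMM. It assumes that selection ranks components by weighted log-likelihood; the function `gselect` is a hypothetical name, and the code returns the indices in memory rather than writing them to disk as the method above does.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gselect(frames, weights, means, variances, num_gselect):
    """For each frame, return indices of the top-scoring Gaussians."""
    selected = []
    for x in frames:
        scores = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        # Rank component indices from best to worst score.
        ranked = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
        selected.append(ranked[:num_gselect])
    return selected


# Toy model: four equally weighted, unit-variance, one-dimensional Gaussians.
weights = [0.25, 0.25, 0.25, 0.25]
means = [[0.0], [2.0], [4.0], [6.0]]
variances = [[1.0], [1.0], [1.0], [1.0]]

indices = gselect([[0.1], [5.8]], weights, means, variances, num_gselect=2)
print(indices)  # [[0, 1], [3, 2]]
```

Restricting later computation to the selected indices is what lets each training iteration avoid evaluating every Gaussian for every frame.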

gmm_gselect_arguments()

Generate Job arguments for GmmGselectFunction

Returns:

Arguments for processing

Return type:

list[GmmGselectArguments]

property model_path

Current iteration’s DUBM model path

property next_model_path

Next iteration’s DUBM model path

train_iteration()

Run an iteration of UBM training

property train_type

Training identifier