TranscriberMixin

class montreal_forced_aligner.transcription.transcriber.TranscriberMixin(transition_scale=1.0, acoustic_scale=0.083333, self_loop_scale=0.1, beam=10, silence_weight=0.0, first_beam=10, first_max_active=2000, language_model_weight=10, word_insertion_penalty=0.5, evaluation_mode=False, **kwargs)

Bases: CorpusAligner

Abstract class for MFA transcribers

Parameters:

transition_scale (float) – Transition scale, defaults to 1.0
acoustic_scale (float) – Acoustic scale, defaults to 0.083333
self_loop_scale (float) – Self-loop scale, defaults to 0.1
beam (int) – Size of the beam to use in decoding, defaults to 10
silence_weight (float) – Weight on silence in fMLLR estimation, defaults to 0.0
max_active (int) – Max active for decoding
lattice_beam (int) – Beam width for decoding lattices
first_beam (int) – Beam for decoding in initial speaker-independent pass, only used if uses_speaker_adaptation is true, defaults to 10
first_max_active (int) – Max active for decoding in initial speaker-independent pass, only used if uses_speaker_adaptation is true, defaults to 2000
language_model_weight (float) – Weight of language model, defaults to 10
word_insertion_penalty (float) – Penalty for inserting words, defaults to 0.5
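
As a rough illustration of how language_model_weight and word_insertion_penalty trade off against acoustic evidence, the sketch below follows the convention of Kaldi's scoring scripts, where acoustic costs are divided by the language model weight and a fixed penalty is added per word. Whether MFA combines the values in exactly this way is an assumption; path_cost, best_hypothesis, and all cost values are hypothetical. (The default acoustic_scale of 0.083333 is approximately 1/12.)

```python
def path_cost(acoustic_cost, graph_cost, num_words,
              language_model_weight=10.0, word_insertion_penalty=0.5):
    """Return a Kaldi-style combined cost for one hypothesis (lower is better).

    Assumption: acoustic costs are scaled by 1 / language_model_weight and
    the penalty is added once per word, as in Kaldi's scoring scripts.
    """
    scaled_acoustic = acoustic_cost / language_model_weight
    return graph_cost + scaled_acoustic + word_insertion_penalty * num_words


def best_hypothesis(hypotheses, **weights):
    """Return the name of the hypothesis with the lowest combined cost."""
    return min(hypotheses, key=lambda name: path_cost(*hypotheses[name], **weights))


# Hypothetical costs: (acoustic_cost, graph_cost, num_words)
hypotheses = {
    "six_words": (1200.0, 35.0, 6),
    "seven_words": (1188.0, 36.0, 7),
}

print(best_hypothesis(hypotheses, word_insertion_penalty=0.5))
print(best_hypothesis(hypotheses, word_insertion_penalty=0.0))
```

With these toy numbers, the penalty of 0.5 tips the choice toward the shorter hypothesis, while a penalty of 0.0 lets the slightly better acoustic fit of the longer one win.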

calc_final_fmllr()

Calculate final fMLLR transforms

See also

FinalFmllrFunction: Multiprocessing function
TranscriberMixin.final_fmllr_arguments: Arguments for function
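
For orientation, an fMLLR transform is a per-speaker affine transform of the acoustic features. The sketch below applies one such transform in plain Python, assuming the Kaldi convention of storing it as a d x (d+1) matrix whose last column is the offset. It does not estimate the transform, which is what the method above is for; apply_fmllr and the toy values are hypothetical.

```python
def apply_fmllr(transform, features):
    """Apply an affine transform [A | b] to every feature frame.

    transform: list of d rows, each with d + 1 values (last value is the offset).
    features:  list of frames, each a list of d values.
    """
    transformed = []
    for frame in features:
        extended = list(frame) + [1.0]  # append 1 so the last column acts as the offset
        transformed.append(
            [sum(w * x for w, x in zip(row, extended)) for row in transform]
        )
    return transformed


# Toy 2-dimensional example (hypothetical values).
speaker_transform = [
    [2.0, 0.0, 1.0],   # y0 = 2.0 * x0 + 1.0
    [0.0, 0.5, -1.0],  # y1 = 0.5 * x1 - 1.0
]
frames = [[1.0, 4.0], [0.0, 0.0]]
print(apply_fmllr(speaker_transform, frames))
```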

calc_initial_fmllr()

Calculate initial fMLLR transforms

See also

InitialFmllrFunction: Multiprocessing function
TranscriberMixin.initial_fmllr_arguments: Arguments for function

carpa_lm_rescore()

Rescore lattices with CARPA language model

See also

CarpaLmRescoreFunction: Multiprocessing function
TranscriberMixin.carpa_lm_rescore_arguments: Arguments for function

carpa_lm_rescore_arguments()

Generate Job arguments for CarpaLmRescoreFunction

Returns: Arguments for processing
Return type: list[CarpaLmRescoreArguments]

compute_wer()

Evaluates the transcripts if there are reference transcripts

Raises: KaldiProcessingError – If there were any errors in running Kaldi binaries

decode()

Generate lattices

See also

DecodeFunction: Multiprocessing function
TranscriberMixin.decode_arguments: Arguments for function
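
The beam and max_active parameters control pruning during decoding. The following is a minimal sketch of that idea, not MFA's or Kaldi's actual decoder: at each frame, hypotheses whose cost exceeds the best cost by more than the beam are dropped, and at most max_active of the remainder survive. prune_tokens and the state names are hypothetical.

```python
def prune_tokens(token_costs, beam=10.0, max_active=2000):
    """Keep tokens within `beam` of the best cost, capped at `max_active`.

    token_costs maps a decoder state to its accumulated cost (lower is better).
    """
    if not token_costs:
        return {}
    best = min(token_costs.values())
    # Beam pruning: drop anything too far behind the current best token.
    survivors = {s: c for s, c in token_costs.items() if c <= best + beam}
    # Max-active pruning: if still too many, keep only the cheapest ones.
    if len(survivors) > max_active:
        cheapest = sorted(survivors.items(), key=lambda item: item[1])[:max_active]
        survivors = dict(cheapest)
    return survivors


tokens = {"s0": 5.0, "s1": 9.5, "s2": 16.0, "s3": 14.9}
print(prune_tokens(tokens, beam=10.0))
print(prune_tokens(tokens, beam=10.0, max_active=2))
```

A wider beam or larger max_active keeps more hypotheses alive, which generally costs more time and memory in exchange for a lower chance of pruning away the correct path.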

decode_arguments(workflow=WorkflowType.transcription)

Generate Job arguments for DecodeFunction

Returns: Arguments for processing
Return type: list[DecodeArguments]

evaluate_transcriptions()

Evaluates the transcripts if there are reference transcripts

Returns: Sentence error rate and word error rate
Return type: float, float
Raises: KaldiProcessingError – If there were any errors in running Kaldi binaries
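
To make the two returned values concrete, here is a minimal sketch of corpus-level sentence error rate (SER) and word error rate (WER) computed from reference/hypothesis pairs with a word-level edit distance. It is not MFA's implementation, which may normalize text or aggregate differently; edit_distance, error_rates, and the example sentences are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two word lists."""
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            current.append(min(
                previous[j] + 1,                           # deletion
                current[j - 1] + 1,                        # insertion
                previous[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        previous = current
    return previous[-1]


def error_rates(pairs):
    """Return (sentence error rate, word error rate) over (reference, hypothesis) pairs."""
    total_words = total_edits = wrong_sentences = 0
    for reference, hypothesis in pairs:
        ref_words, hyp_words = reference.split(), hypothesis.split()
        edits = edit_distance(ref_words, hyp_words)
        total_edits += edits
        total_words += len(ref_words)
        wrong_sentences += edits > 0
    ser = wrong_sentences / len(pairs) if pairs else 0.0
    wer = total_edits / total_words if total_words else 0.0
    return ser, wer


pairs = [
    ("the cat sat", "the cat sat"),
    ("a dog barked loudly", "a dog barks"),
]
ser, wer = error_rates(pairs)
print(f"SER={ser:.3f} WER={wer:.3f}")
```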

final_fmllr_arguments()

Generate Job arguments for FinalFmllrFunction

Returns: Arguments for processing
Return type: list[FinalFmllrArguments]

fmllr_rescore()

Rescore lattices with final fMLLR transforms

See also

FmllrRescoreFunction: Multiprocessing function
TranscriberMixin.fmllr_rescore_arguments: Arguments for function

fmllr_rescore_arguments()

Generate Job arguments for FmllrRescoreFunction

Returns: Arguments for processing
Return type: list[FmllrRescoreArguments]

initial_fmllr_arguments()

Generate Job arguments for InitialFmllrFunction

Returns: Arguments for processing
Return type: list[InitialFmllrArguments]

lm_rescore()

Rescore lattices with bigger language model

See also

LmRescoreFunction: Multiprocessing function
TranscriberMixin.lm_rescore_arguments: Arguments for function

lm_rescore_arguments()

Generate Job arguments for LmRescoreFunction

Returns: Arguments for processing
Return type: list[LmRescoreArguments]

property lm_rescore_options

Options needed for language model rescoring

property model_directory

Model directory for the transcriber

property model_log_directory

Log directory for the transcriber's model

save_transcription_evaluation(output_directory)

Save transcription evaluation to an output directory

Parameters: output_directory (str) – Directory to save evaluation

setup_phone_lm()

Set up phone language model for phone-based transcription

train_phone_lm()

Train a phone-based language model (i.e., not using words).
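
To illustrate what a language model over phones rather than words looks like, the sketch below estimates unsmoothed bigram probabilities from phone sequences. This is only a toy under stated assumptions: the tooling, n-gram order, and smoothing MFA actually uses are not described here, and train_phone_bigram and the example phones are hypothetical.

```python
from collections import Counter, defaultdict


def train_phone_bigram(phone_sequences):
    """Estimate maximum-likelihood bigram probabilities over phone sequences."""
    counts = defaultdict(Counter)
    for phones in phone_sequences:
        # Pad with boundary markers so utterance starts and ends are modeled too.
        padded = ["<s>"] + list(phones) + ["</s>"]
        for previous, current in zip(padded, padded[1:]):
            counts[previous][current] += 1
    model = {}
    for previous, following in counts.items():
        total = sum(following.values())
        model[previous] = {phone: n / total for phone, n in following.items()}
    return model


# Hypothetical phone transcriptions of two utterances.
phone_lm = train_phone_bigram([["k", "ae", "t"], ["k", "ae", "b"]])
print(phone_lm["ae"])
```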

train_speaker_lm_arguments()

Generate Job arguments for TrainSpeakerLmFunction

Returns: Arguments for processing
Return type: list[TrainSpeakerLmArguments]

train_speaker_lms()

Train language models for each speaker based on their utterances

transcribe_fmllr()

Run fMLLR estimation over initial decoding lattices and rescore

See also

InitialFmllrFunction: Multiprocessing helper function for each job
LatGenFmllrFunction: Multiprocessing helper function for each job
FinalFmllrFunction: Multiprocessing helper function for each job
FmllrRescoreFunction: Multiprocessing helper function for each job
LmRescoreFunction: Multiprocessing helper function for each job
CarpaLmRescoreFunction: Multiprocessing helper function for each job

property transcribe_fmllr_options

Options needed for calculating fMLLR transformations

transcribe_utterances()

Transcribe the corpus

See also

DecodeFunction: Multiprocessing helper function for each job
LmRescoreFunction: Multiprocessing helper function for each job
CarpaLmRescoreFunction: Multiprocessing helper function for each job

Raises: KaldiProcessingError – If there were any errors in running Kaldi binaries