ValidationMixin

class montreal_forced_aligner.validation.ValidationMixin(ignore_acoustics=False, test_transcriptions=False, target_num_ngrams=100, order=3, method='kneser_ney', **kwargs)

Bases: object

Mixin class for performing validation on a corpus

Parameters:

ignore_acoustics (bool) – Flag for whether feature generation and training/alignment should be skipped
test_transcriptions (bool) – Flag for whether utterance transcriptions should be tested with a unigram language model
phone_alignment (bool) – Flag for whether alignments should be compared to a phone-based system
target_num_ngrams (int) – Target number of ngrams from speaker models to use

See also

CorpusAligner: For corpus, dictionary, and alignment parameters
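The constructor signature above ends in **kwargs, which is the usual shape for a mixin that consumes its own options and forwards the rest to the other classes it is combined with. The following is a minimal sketch of that pattern only. The class names ValidationOptionsSketch, CorpusSketch, and ValidatorSketch and the corpus_directory parameter are hypothetical stand-ins, not part of montreal_forced_aligner, and the sketch makes no claim about how ValidationMixin is implemented internally.

```python
class ValidationOptionsSketch:
    """Hypothetical stand-in that mirrors the documented constructor defaults."""

    def __init__(
        self,
        ignore_acoustics=False,
        test_transcriptions=False,
        target_num_ngrams=100,
        order=3,
        method="kneser_ney",
        **kwargs,
    ):
        # Forward every keyword argument this class does not own
        # to the next class in the method resolution order.
        super().__init__(**kwargs)
        self.ignore_acoustics = ignore_acoustics
        self.test_transcriptions = test_transcriptions
        self.target_num_ngrams = target_num_ngrams
        self.order = order
        self.method = method


class CorpusSketch:
    """Hypothetical base class that owns a corpus-level setting."""

    def __init__(self, corpus_directory=None, **kwargs):
        super().__init__(**kwargs)
        self.corpus_directory = corpus_directory


class ValidatorSketch(ValidationOptionsSketch, CorpusSketch):
    """Combines the validation options with the corpus settings."""


# Each class picks out its own keyword arguments from a single call.
validator = ValidatorSketch(
    corpus_directory="/path/to/corpus",
    test_transcriptions=True,
)
```

Because every class calls super().__init__(**kwargs), the classes can be combined in one constructor call without any of them needing to know the other's parameters.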

analyze_files_with_no_transcription(): Analyzes issues with sound files that have no transcription files in the corpus and constructs a message

analyze_missing_features(): Analyzes issues in feature generation in the corpus and constructs a message

analyze_oovs(): Analyzes out-of-vocabulary words (OOVs) in the corpus and constructs a message

analyze_setup(): Analyzes the setup process and outputs info to the console

analyze_textgrid_read_errors(): Analyzes issues with reading TextGrid files in the corpus and constructs a message

analyze_transcriptions_with_no_wavs(): Analyzes issues with transcriptions that have no sound files in the corpus and constructs a message

analyze_unreadable_text_files(): Analyzes issues with reading text files in the corpus and constructs a message

analyze_wav_errors(): Analyzes any sound file issues in the corpus and constructs a message

test_utterance_transcriptions()

Tests utterance transcriptions with simple unigram models based on the utterance text and frequent words in the corpus

Raises: KaldiProcessingError – If there were any errors in running Kaldi binaries
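To illustrate the idea of a unigram model built from "the utterance text and frequent words in the corpus", here is a minimal pure-Python sketch. It is an assumption-laden toy and not MFA's procedure, which runs Kaldi binaries. The helper name build_unigram_model, its num_frequent_words parameter, and the choice to add one count per frequent corpus word are all hypothetical.

```python
from collections import Counter


def build_unigram_model(utterance_text, corpus_word_counts, num_frequent_words=2):
    """Return a word -> probability mapping for one utterance.

    Hypothetical helper: combines the words of the utterance with the
    most frequent words of the corpus, then normalizes the counts.
    """
    counts = Counter(utterance_text.split())
    # Assumption: each frequent corpus word contributes one extra count,
    # so the model can also recognize common words missing from the text.
    for word, _ in corpus_word_counts.most_common(num_frequent_words):
        counts[word] += 1
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


# Toy corpus counts; in practice these would come from the whole corpus.
corpus_counts = Counter("the cat sat on the mat the dog sat".split())
model = build_unigram_model("the cat sat", corpus_counts)
```

A model this small accepts little beyond the transcribed words, so a decoding that differs from the transcription is a hint that the text and the audio may not match.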

property working_log_directory: Working log directory