ValidationMixin

class montreal_forced_aligner.validation.ValidationMixin(ignore_acoustics=False, test_transcriptions=False, target_num_ngrams=100, order=3, method='kneser_ney', **kwargs)

Bases: object

Mixin class for performing validation on a corpus

Parameters:
  • ignore_acoustics (bool) – Flag for whether feature generation and training/alignment should be skipped

  • test_transcriptions (bool) – Flag for whether utterance transcriptions should be tested with a unigram language model

  • phone_alignment (bool) – Flag for whether alignments should be compared to a phone-based system

  • target_num_ngrams (int) – Target number of ngrams from speaker models to use

See also

CorpusAligner

For corpus, dictionary, and alignment parameters
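Because the constructor accepts `**kwargs`, the mixin is presumably meant to be combined with other classes through cooperative multiple inheritance. The sketch below illustrates that pattern only. `CorpusBase`, `ValidationOptions`, and `ToyValidator` are hypothetical stand-ins written for this example, not classes from the library; only the keyword names and defaults are taken from the signature above.

```python
class CorpusBase:
    """Hypothetical stand-in for a corpus-handling base class (not from the library)."""

    def __init__(self, corpus_directory: str = "", **kwargs):
        super().__init__(**kwargs)
        self.corpus_directory = corpus_directory


class ValidationOptions:
    """Hypothetical mixin that mirrors the documented constructor signature."""

    def __init__(
        self,
        ignore_acoustics: bool = False,
        test_transcriptions: bool = False,
        target_num_ngrams: int = 100,
        order: int = 3,
        method: str = "kneser_ney",
        **kwargs,
    ):
        # Pass any keyword this mixin does not consume to the next class in the MRO.
        super().__init__(**kwargs)
        self.ignore_acoustics = ignore_acoustics
        self.test_transcriptions = test_transcriptions
        self.target_num_ngrams = target_num_ngrams
        self.order = order
        self.method = method


class ToyValidator(ValidationOptions, CorpusBase):
    """Hypothetical concrete class combining the mixin with a base class."""


validator = ToyValidator(corpus_directory="/path/to/corpus", test_transcriptions=True)
```

Each class keeps the keywords it understands and forwards the rest, so validation options and corpus options can be supplied in a single constructor call.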

analyze_files_with_no_transcription()

Analyzes issues with sound files that have no transcription files in the corpus and constructs a message

analyze_missing_features()

Analyzes issues in feature generation in the corpus and constructs a message

analyze_oovs()

Analyzes out-of-vocabulary items (OOVs) in the corpus and constructs a message
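As a rough illustration of what an OOV check involves, the sketch below counts words in utterance text that are missing from a set of known words. The function `count_oovs`, the lowercasing, and the whitespace tokenization are assumptions made for this example; they do not describe the library's own implementation.

```python
from collections import Counter


def count_oovs(utterances, vocabulary):
    """Count tokens that do not appear in the known-word set (hypothetical helper)."""
    oov_counts = Counter()
    for text in utterances:
        # Assumption: lowercase and split on whitespace; real tokenization may differ.
        for word in text.lower().split():
            if word not in vocabulary:
                oov_counts[word] += 1
    return oov_counts


vocabulary = {"the", "cat", "sat"}
utterances = ["The cat sat", "the dog sat", "dog days"]
oov_counts = count_oovs(utterances, vocabulary)
```

Here `oov_counts` maps each unknown word to its frequency, which is the kind of summary a validation message could report.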

analyze_setup()

Analyzes the setup process and outputs info to the console

analyze_textgrid_read_errors()

Analyzes issues with reading TextGrid files in the corpus and constructs a message

analyze_transcriptions_with_no_wavs()

Analyzes issues with transcriptions that have no sound files in the corpus and constructs a message
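The two pairing checks (sound files without transcriptions, and transcriptions without sound files) can be pictured as set differences over file stems. The following is a minimal sketch under stated assumptions: `find_unpaired` is a hypothetical helper, and the extension sets are illustrative rather than the library's actual list of supported formats.

```python
from pathlib import PurePath

# Assumed extension sets, for illustration only.
AUDIO_EXTENSIONS = {".wav", ".flac"}
TEXT_EXTENSIONS = {".lab", ".txt", ".textgrid"}


def find_unpaired(file_names):
    """Return (audio stems lacking a transcription, transcription stems lacking audio)."""
    audio_stems = set()
    text_stems = set()
    for name in file_names:
        path = PurePath(name)
        suffix = path.suffix.lower()
        if suffix in AUDIO_EXTENSIONS:
            audio_stems.add(path.stem)
        elif suffix in TEXT_EXTENSIONS:
            text_stems.add(path.stem)
    return sorted(audio_stems - text_stems), sorted(text_stems - audio_stems)


no_transcription, no_audio = find_unpaired(["a.wav", "a.lab", "b.wav", "c.TextGrid"])
```

In this toy input, `b` has audio but no transcription and `c` has a transcription but no audio.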

analyze_unreadable_text_files()

Analyzes issues with reading text files in the corpus and constructs a message

analyze_wav_errors()

Analyzes any sound file issues in the corpus and constructs a message

test_utterance_transcriptions()

Tests utterance transcriptions with simple unigram models based on the utterance text and frequent words in the corpus

Raises:

KaldiProcessingError – If there were any errors in running Kaldi binaries
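To make the idea of a per-utterance unigram model concrete, the sketch below builds a vocabulary from one utterance's words plus the most frequent corpus words, then assigns add-one smoothed probabilities. It is an illustration only: `build_unigram_model`, `num_frequent_words`, and the smoothing choice are assumptions for this example, and the library's actual procedure, which the note above indicates relies on Kaldi binaries, is not reproduced here.

```python
from collections import Counter


def build_unigram_model(utterance_text, corpus_texts, num_frequent_words=2):
    """Return word -> probability for one utterance (hypothetical helper)."""
    # Frequent words across the whole corpus widen the vocabulary beyond the utterance.
    corpus_counts = Counter(word for text in corpus_texts for word in text.split())
    frequent_words = [word for word, _ in corpus_counts.most_common(num_frequent_words)]

    utterance_counts = Counter(utterance_text.split())
    vocabulary = set(utterance_counts) | set(frequent_words)

    # Add-one smoothing (an assumption) so frequent words absent from the
    # utterance still receive nonzero probability.
    total = sum(utterance_counts.values()) + len(vocabulary)
    return {word: (utterance_counts[word] + 1) / total for word in vocabulary}


corpus_texts = ["the cat sat", "the cat ran", "the dog"]
model = build_unigram_model("the dog ran", corpus_texts)
```

With this toy corpus the vocabulary is the three utterance words plus `cat`, and the probabilities sum to 1. Decoding an utterance against such a restricted model is one way to flag audio that does not match its transcription.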

property working_log_directory

Working log directory