ValidationMixin

class montreal_forced_aligner.validation.ValidationMixin(ignore_acoustics=False, test_transcriptions=False, target_num_ngrams=100, order=3, method='kneser_ney', **kwargs)

Bases: object

Mixin class for performing validation on a corpus

Parameters:
  • ignore_acoustics (bool) – Flag for whether feature generation and training/alignment should be skipped

  • test_transcriptions (bool) – Flag for whether utterance transcriptions should be tested with a unigram language model

  • phone_alignment (bool) – Flag for whether alignments should be compared to a phone-based system

  • target_num_ngrams (int) – Target number of ngrams from speaker models to use

See also

CorpusAligner

For corpus, dictionary, and alignment parameters
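Because the class is a mixin that accepts `**kwargs`, it is meant to be combined with other classes that consume the remaining keyword arguments. The following is a minimal sketch of that cooperative pattern, not MFA's own code: `CorpusBase`, `ValidationOptionsMixin`, `Validator`, and the `corpus_directory` argument are hypothetical names used only for illustration, and only the defaults are taken from the signature above.

```python
class CorpusBase:
    """Hypothetical stand-in for a class that owns corpus parameters."""

    def __init__(self, corpus_directory=None, **kwargs):
        super().__init__(**kwargs)
        self.corpus_directory = corpus_directory


class ValidationOptionsMixin:
    """Hypothetical mixin mirroring the keyword arguments in the signature above."""

    def __init__(
        self,
        ignore_acoustics=False,
        test_transcriptions=False,
        target_num_ngrams=100,
        order=3,
        method="kneser_ney",
        **kwargs,
    ):
        # Pass every argument this mixin does not recognise to the next class in the MRO.
        super().__init__(**kwargs)
        self.ignore_acoustics = ignore_acoustics
        self.test_transcriptions = test_transcriptions
        self.target_num_ngrams = target_num_ngrams
        self.order = order
        self.method = method


class Validator(ValidationOptionsMixin, CorpusBase):
    """Hypothetical concrete class combining the mixin with a corpus class."""


validator = Validator(corpus_directory="/path/to/corpus", test_transcriptions=True)
```

Listing the mixin first in the base classes lets it pick out its own options before the corpus class sees the rest.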

analyze_files_with_no_transcription(output_directory=None)

Analyzes issues with sound files that have no transcription files in the corpus and constructs a message

Parameters:

output_directory (Path, optional) – Optional directory to save output files in
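As an illustration of the kind of check described here, the sketch below pairs files by directory and stem and reports sound files that have no matching transcription. It is not MFA's implementation: the function name and the two extension sets are assumptions made for the example.

```python
from pathlib import Path

# Assumed extension sets, chosen for the example only.
SOUND_EXTENSIONS = {".wav", ".flac"}
TEXT_EXTENSIONS = {".lab", ".txt", ".textgrid"}


def sound_files_without_transcription(paths):
    """Return sound files with no transcription sharing their directory and stem."""
    paths = [Path(p) for p in paths]
    transcribed = {
        (p.parent, p.stem) for p in paths if p.suffix.lower() in TEXT_EXTENSIONS
    }
    return sorted(
        p
        for p in paths
        if p.suffix.lower() in SOUND_EXTENSIONS
        and (p.parent, p.stem) not in transcribed
    )


example = [
    "speaker1/a.wav",
    "speaker1/a.lab",
    "speaker1/b.wav",
    "speaker2/c.TextGrid",
]
missing = sound_files_without_transcription(example)
```

Swapping the two extension sets gives the reverse check, transcriptions that have no sound file.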

analyze_missing_features(output_directory=None)

Analyzes issues in feature generation in the corpus and constructs a message

Parameters:

output_directory (Path, optional) – Optional directory to save output files in

analyze_oovs(output_directory=None)

Analyzes out-of-vocabulary (OOV) words in the corpus and constructs a message

Parameters:

output_directory (Path, optional) – Optional directory to save output files in
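An OOV analysis boils down to counting word tokens that are absent from the pronunciation dictionary. The sketch below shows that idea under simple assumptions (whitespace tokenization, lowercasing); `count_oovs` is a hypothetical helper, not part of MFA.

```python
from collections import Counter


def count_oovs(utterances, dictionary_words):
    """Count tokens in the utterance texts that are not in the dictionary."""
    oov_counts = Counter()
    for text in utterances:
        for word in text.lower().split():
            if word not in dictionary_words:
                oov_counts[word] += 1
    return oov_counts


dictionary_words = {"the", "cat", "sat"}
utterances = ["the cat sat", "the dog sat", "a dog"]
oovs = count_oovs(utterances, dictionary_words)
```

Sorting the result with `oovs.most_common()` puts the words most worth adding to the dictionary first.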

analyze_setup(output_directory=None)

Analyzes the setup process and outputs info to the console

Parameters:

output_directory (Path, optional) – Optional directory to save output files in

analyze_textgrid_read_errors(output_directory=None)

Analyzes issues with reading TextGrid files in the corpus and constructs a message

Parameters:

output_directory (Path, optional) – Optional directory to save output files in

analyze_transcriptions_with_no_wavs(output_directory=None)

Analyzes issues with transcriptions that have no sound files in the corpus and constructs a message

Parameters:

output_directory (Path, optional) – Optional directory to save output files in

analyze_unreadable_text_files(output_directory=None)

Analyzes issues with reading text files in the corpus and constructs a message

Parameters:

output_directory (Path, optional) – Optional directory to save output files in

analyze_wav_errors(output_directory=None)

Analyzes any sound file issues in the corpus and constructs a message

Parameters:

output_directory (Path, optional) – Optional directory to save output files in
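To give a sense of what a sound file issue can look like, the sketch below tries to open each file with Python's standard `wave` module and records the ones that fail or contain no frames. This is only an approximation for illustration: `find_unreadable_wavs` is a hypothetical helper, and MFA's own loading and error reporting may differ.

```python
import wave
from pathlib import Path


def find_unreadable_wavs(paths):
    """Return (path, reason) pairs for files the wave module cannot read."""
    problems = []
    for path in paths:
        try:
            with wave.open(str(path), "rb") as wav_file:
                if wav_file.getnframes() == 0:
                    problems.append((Path(path), "no audio frames"))
        except (wave.Error, EOFError, OSError) as error:
            # Fall back to the exception class name when the message is empty.
            problems.append((Path(path), str(error) or type(error).__name__))
    return problems
```

Calling the function on a list of `.wav` paths returns an empty list when every file opens cleanly.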

test_utterance_transcriptions(output_directory=None)

Tests utterance transcriptions with simple unigram models based on the utterance text and frequent words in the corpus

Parameters:

output_directory (Path, optional) – Optional directory to save output files in

Raises:

KaldiProcessingError – If there were any errors in running Kaldi binaries
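The description mentions unigram models built from the utterance text and frequent words in the corpus. The sketch below covers only that vocabulary-and-probability step, in plain Python: the function name, the `num_frequent_words` parameter, and the add-one treatment of frequent words are assumptions for illustration, and the Kaldi-based processing the method relies on is not reproduced.

```python
from collections import Counter


def build_unigram_model(utterance_text, corpus_texts, num_frequent_words=2):
    """Return a word -> probability mapping for one utterance's unigram model."""
    corpus_counts = Counter(word for text in corpus_texts for word in text.split())
    frequent_words = [word for word, _ in corpus_counts.most_common(num_frequent_words)]

    # Start from the words of the utterance itself.
    counts = Counter(utterance_text.split())
    # Give each frequent corpus word one extra count so it can compete with them.
    for word in frequent_words:
        counts[word] += 1

    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


corpus_texts = ["the cat sat", "the dog sat", "the bird flew"]
model = build_unigram_model("the bird flew", corpus_texts)
```

Keeping frequent corpus words in each utterance's model gives a recognizer plausible alternatives, so a transcript that does not match the audio is more likely to show up as a mismatch.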

property working_log_directory

Working log directory