ValidationMixin
- class montreal_forced_aligner.validation.ValidationMixin(ignore_acoustics=False, test_transcriptions=False, target_num_ngrams=100, order=3, method='kneser_ney', **kwargs)
Bases: object
Mixin class for performing validation on a corpus
- Parameters:
ignore_acoustics (bool) – Flag for whether feature generation and training/alignment should be skipped
test_transcriptions (bool) – Flag for whether utterance transcriptions should be tested with a unigram language model
phone_alignment (bool) – Flag for whether alignments should be compared to a phone-based system
target_num_ngrams (int) – Target number of ngrams from speaker models to use
See also
CorpusAligner
For corpus, dictionary, and alignment parameters
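The signature above accepts **kwargs, which is the usual way for a mixin to cooperate with other base classes. The following is a minimal sketch of that pattern only. CorpusStandIn, ValidationOptionsMixin, and Validator are hypothetical names invented for illustration and are not part of Montreal Forced Aligner; the keyword defaults are copied from the signature above.

```python
class CorpusStandIn:
    """Hypothetical stand-in for a corpus-handling base class (not MFA code)."""

    def __init__(self, corpus_directory: str = "", **kwargs):
        super().__init__(**kwargs)
        self.corpus_directory = corpus_directory


class ValidationOptionsMixin:
    """Hypothetical mixin that stores the validation options from the signature."""

    def __init__(
        self,
        ignore_acoustics: bool = False,
        test_transcriptions: bool = False,
        target_num_ngrams: int = 100,
        order: int = 3,
        method: str = "kneser_ney",
        **kwargs,
    ):
        # Pass any keyword arguments this mixin does not recognize
        # to the next class in the method resolution order.
        super().__init__(**kwargs)
        self.ignore_acoustics = ignore_acoustics
        self.test_transcriptions = test_transcriptions
        self.target_num_ngrams = target_num_ngrams
        self.order = order
        self.method = method


class Validator(ValidationOptionsMixin, CorpusStandIn):
    """Hypothetical concrete class combining the mixin with a corpus class."""


validator = Validator(corpus_directory="/path/to/corpus", test_transcriptions=True)
```

Listing the mixin first in the bases means it consumes its own options and forwards everything else (here, corpus_directory) to the corpus class.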
- analyze_files_with_no_transcription(output_directory=None)
Analyzes sound files in the corpus that have no transcription files and constructs a message
- Parameters:
output_directory (Path, optional) – Optional directory to save output files in
- analyze_missing_features(output_directory=None)
Analyzes issues in feature generation in the corpus and constructs a message
- Parameters:
output_directory (Path, optional) – Optional directory to save output files in
- analyze_oovs(output_directory=None)
Analyzes out-of-vocabulary words (OOVs) in the corpus and constructs a message
- Parameters:
output_directory (Path, optional) – Optional directory to save output files in
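As an illustration of what an OOV analysis involves, here is a minimal sketch that counts corpus words missing from a pronunciation dictionary and builds a summary message. It is not MFA's implementation: count_oovs, report_oovs, the lowercasing and whitespace tokenization, and the output file name oovs_found.txt are all assumptions made for this example.

```python
from collections import Counter
from pathlib import Path
from typing import Dict, Optional, Set


def count_oovs(utterances: Dict[str, str], lexicon: Set[str]) -> Counter:
    """Count word tokens in the utterance texts that are absent from the lexicon."""
    counts: Counter = Counter()
    for text in utterances.values():
        # Assumption: lowercase, whitespace-delimited tokens.
        for word in text.lower().split():
            if word not in lexicon:
                counts[word] += 1
    return counts


def report_oovs(counts: Counter, output_directory: Optional[Path] = None) -> str:
    """Build a summary message; optionally write the OOV list to a file."""
    if not counts:
        return "No OOVs found."
    message = f"{len(counts)} OOV word types ({sum(counts.values())} tokens) found."
    if output_directory is not None:
        output_directory = Path(output_directory)
        output_directory.mkdir(parents=True, exist_ok=True)
        out_path = output_directory / "oovs_found.txt"  # hypothetical file name
        with open(out_path, "w", encoding="utf8") as f:
            for word, count in counts.most_common():
                f.write(f"{word}\t{count}\n")
        message += f" List written to {out_path}."
    return message


utterances = {"utt1": "the cat sat", "utt2": "the dog zorped"}
lexicon = {"the", "cat", "sat", "dog"}
oov_counts = count_oovs(utterances, lexicon)
summary = report_oovs(oov_counts)
```

When no output directory is given, the sketch only returns the message, mirroring the optional output_directory parameter documented above.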
- analyze_setup(output_directory=None)
Analyzes the setup process and outputs info to the console
- Parameters:
output_directory (Path, optional) – Optional directory to save output files in
- analyze_textgrid_read_errors(output_directory=None)
Analyzes issues with reading TextGrid files in the corpus and constructs a message
- Parameters:
output_directory (Path, optional) – Optional directory to save output files in
- analyze_transcriptions_with_no_wavs(output_directory=None)
Analyzes transcription files in the corpus that have no sound files and constructs a message
- Parameters:
output_directory (Path, optional) – Optional directory to save output files in
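The two pairing checks (sound files with no transcription, and transcriptions with no sound file) can be pictured as a comparison of file names with their extensions removed. The sketch below assumes that pairing rule and an illustrative set of extensions; find_unpaired_files and both extension sets are hypothetical and are not taken from MFA.

```python
import tempfile
from pathlib import Path
from typing import Dict, List, Tuple

# Assumed extensions, for illustration only.
SOUND_EXTENSIONS = {".wav", ".flac"}
TEXT_EXTENSIONS = {".lab", ".txt", ".textgrid"}


def find_unpaired_files(corpus_directory: Path) -> Tuple[List[Path], List[Path]]:
    """Return (sound files with no transcript, transcripts with no sound file)."""
    sounds: Dict[Path, Path] = {}
    texts: Dict[Path, Path] = {}
    for path in Path(corpus_directory).rglob("*"):
        if not path.is_file():
            continue
        key = path.with_suffix("")  # same directory and name, extension removed
        suffix = path.suffix.lower()
        if suffix in SOUND_EXTENSIONS:
            sounds[key] = path
        elif suffix in TEXT_EXTENSIONS:
            texts[key] = path
    no_transcription = sorted(p for k, p in sounds.items() if k not in texts)
    no_sound = sorted(p for k, p in texts.items() if k not in sounds)
    return no_transcription, no_sound


# Small demonstration on a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("a.wav", "a.lab", "b.wav", "c.txt"):
        (root / name).touch()
    no_text, no_sound = find_unpaired_files(root)
    no_text_names = [p.name for p in no_text]
    no_sound_names = [p.name for p in no_sound]
```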
- analyze_unreadable_text_files(output_directory=None)
Analyzes issues with reading text files in the corpus and constructs a message
- Parameters:
output_directory (Path, optional) – Optional directory to save output files in
- analyze_wav_errors(output_directory=None)
Analyzes any sound file issues in the corpus and constructs a message
- Parameters:
output_directory (Path, optional) – Optional directory to save output files in
- test_utterance_transcriptions(output_directory=None)
Tests utterance transcriptions with simple unigram models based on the utterance text and frequent words in the corpus
- Parameters:
output_directory (Path, optional) – Optional directory to save output files in
- Raises:
KaldiProcessingError – If there were any errors in running Kaldi binaries
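To make "simple unigram models based on the utterance text and frequent words in the corpus" concrete, here is a minimal sketch of one way such a per-utterance model could be built. It is an assumption-laden illustration, not MFA's implementation: frequent_words, utterance_unigram_model, the whitespace tokenization, and the choice to give each frequent word one pseudo-count are all hypothetical, and no Kaldi decoding is performed.

```python
from collections import Counter
from typing import Dict, List


def frequent_words(corpus_texts: List[str], top_n: int) -> List[str]:
    """Return the top_n most frequent whitespace-delimited words in the corpus."""
    counts = Counter(word for text in corpus_texts for word in text.split())
    return [word for word, _ in counts.most_common(top_n)]


def utterance_unigram_model(utterance: str, frequent: List[str]) -> Dict[str, float]:
    """Unigram probabilities over the utterance's words plus frequent corpus words."""
    counts = Counter(utterance.split())
    for word in frequent:
        counts[word] += 1  # assumed: one pseudo-count per frequent word
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


corpus = ["the cat sat", "the dog sat", "the bird flew"]
frequent = frequent_words(corpus, top_n=2)
model = utterance_unigram_model("the cat sat", frequent)
```

A model this small favors the words of the reference transcript while leaving room for common alternatives, so decoding the audio with it and comparing the result to the transcript is one plausible way to surface utterances whose text does not match the recording.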
- property working_log_directory
Working log directory