TokenizerValidator

class montreal_forced_aligner.tokenization.tokenizer.TokenizerValidator(utterances_to_tokenize=None, **kwargs)

Bases: CorpusTokenizer

compute_validation_errors(gold_values, hypothesis_values)

Computes validation errors between the gold values and the hypothesis values

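Parameters:

gold_values – Gold (reference) values

hypothesis_values – Hypothesis values to compare against the gold values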
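
As an illustration of the kind of comparison a validation step like this performs, here is a minimal, self-contained sketch. It is not MFA's implementation: the function names `edit_distance` and `score_tokenizations`, the choice of Levenshtein distance, and the two reported rates are all assumptions made for the example. It also assumes both arguments map utterance keys to lists of tokens.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (hypothetical helper)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # match or substitution
        prev = curr
    return prev[-1]


def score_tokenizations(gold_values, hypothesis_values):
    """Return (utterance_error_rate, token_error_rate).

    Assumption: both arguments are dicts mapping a key to a list of tokens.
    A key missing from hypothesis_values is scored as an empty hypothesis.
    """
    total_edits = 0
    total_length = 0
    incorrect = 0
    for key, gold in gold_values.items():
        hyp = hypothesis_values.get(key, [])
        edits = edit_distance(gold, hyp)
        total_edits += edits
        total_length += len(gold)
        if edits:
            incorrect += 1
    utterance_error_rate = incorrect / len(gold_values) if gold_values else 0.0
    token_error_rate = total_edits / total_length if total_length else 0.0
    return utterance_error_rate, token_error_rate


gold = {"a": ["the", "cat"], "b": ["a", "dog", "ran"]}
hyp = {"a": ["the", "cat"], "b": ["a", "dogran"]}
uer, ter = score_tokenizations(gold, hyp)
```

In this sketch, one of the two utterances differs from its gold tokenization, and that utterance needs two edits out of five gold tokens in total.
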
property data_directory

Data directory

property data_source_identifier

Dummy “validation” data source

property evaluation_csv_path

Path to the evaluation CSV file in the working directory

setup()

Set up the tokenizer validator

tokenize_utterances()

Tokenize utterances

Returns:

Mappings of keys to their tokenized utterances

Return type:

dict[str, list[str]]
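
To make the return shape concrete, the sketch below builds a value of type `dict[str, list[str]]` with a plain whitespace split. The function `tokenize_by_whitespace` and the sample keys are hypothetical stand-ins for illustration only; they do not reproduce how this class actually tokenizes text.

```python
def tokenize_by_whitespace(utterances: dict[str, str]) -> dict[str, list[str]]:
    """Map each utterance key to its list of tokens.

    Assumption: a whitespace split stands in for the real tokenizer here,
    purely to show the key -> token-list structure of the result.
    """
    return {key: text.split() for key, text in utterances.items()}


tokenized = tokenize_by_whitespace({
    "utt1": "hello world",
    "utt2": "montreal forced aligner",
})
```

Each key in the input appears once in the result, paired with its tokens in their original order.
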