API Reference

Aligner API

There are two main aligner classes, one for using a pretrained model and one for training a model while aligning.

PretrainedAligner(corpus, dictionary, ...[, ...]) Class for aligning a dataset using a pretrained acoustic model
TrainableAligner(corpus, dictionary, ...[, ...]) Aligner that aligns and trains acoustic models on a large dataset

Corpus API

The Corpus class contains information about how a dataset is structured.

Corpus(directory, output_directory[, ...]) Class that stores information about the dataset to align.

Dictionary API

Dictionary(input_path, output_directory[, ...]) Class containing information about a pronunciation dictionary
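As a rough illustration of the kind of information a pronunciation dictionary holds, the sketch below parses lines of the form "word phone phone ..." into a mapping from words to pronunciations. The function name `parse_pronunciations`, the whitespace-separated line format, and the sample entries are assumptions made for this example; they are not taken from the Dictionary class itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def parse_pronunciations(lines: Iterable[str]) -> Dict[str, List[Tuple[str, ...]]]:
    """Map each word to the list of its pronunciations (tuples of phones)."""
    entries: Dict[str, List[Tuple[str, ...]]] = defaultdict(list)
    for raw in lines:
        parts = raw.split()
        if len(parts) < 2:
            # Skip blank lines and words listed without any phones.
            continue
        word, phones = parts[0], tuple(parts[1:])
        # Keep each distinct pronunciation once, in order of appearance.
        if phones not in entries[word]:
            entries[word].append(phones)
    return dict(entries)


# Made-up sample entries; a real dictionary file would be read from disk.
sample = [
    "the DH AH0",
    "the DH IY0",
    "cat K AE1 T",
    "",
]
lexicon = parse_pronunciations(sample)
```

Storing a list of pronunciations per word lets a single word carry several variants, which is common in pronunciation dictionaries.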

Model API

Output from training a model is compressed using the Archive class, which produces a zip archive.

AcousticModel(source[, is_tmpdir])
G2PModel(source[, is_tmpdir])
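The compression step can be pictured with the standard library alone. The sketch below zips a directory of model files and lists the archive's contents. The helper `compress_model_dir` and the placeholder file name are hypothetical and only illustrate the idea; the Archive class may work differently.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def compress_model_dir(model_dir: Path, archive_base: Path) -> Path:
    """Zip the contents of model_dir and return the path to the .zip file."""
    # make_archive appends the ".zip" extension to archive_base itself.
    return Path(shutil.make_archive(str(archive_base), "zip", root_dir=str(model_dir)))


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    model_dir = tmp_path / "model"
    model_dir.mkdir()
    # Placeholder standing in for trained model output (hypothetical file name).
    (model_dir / "final.mdl").write_text("placeholder")
    archive_path = compress_model_dir(model_dir, tmp_path / "my_model")
    with zipfile.ZipFile(archive_path) as zf:
        archived_names = zf.namelist()
```

Passing `root_dir` makes the paths inside the archive relative to the model directory, so the archive unpacks cleanly regardless of where it was created.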

Multiprocessing API

The multiprocessing module contains most of the interactions with Kaldi, because multiple processes are used to speed up the setup and alignment of the dataset.

mfcc(mfcc_directory, log_directory, ...) Multiprocessing function that converts wav files into MFCCs
compile_train_graphs(directory, ...[, debug]) Multiprocessing function that compiles training graphs for utterances
mono_align_equal(mono_directory, ...) Multiprocessing function that creates equal alignments for base monophone training
align(iteration, directory, split_directory, ...) Multiprocessing function that aligns based on the current model
acc_stats(iteration, directory, ...[, fmllr]) Multiprocessing function that computes stats for GMM training
tree_stats(directory, align_directory, ...) Multiprocessing function that computes stats for decision tree training
calc_fmllr(directory, split_directory, ...) Multiprocessing function that computes speaker adaptation (fMLLR)
convert_alignments(directory, ...) Multiprocessing function that converts alignments from previous training
convert_ali_to_textgrids(output_directory, ...) Multiprocessing function that converts alignments to TextGrid files
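The general pattern behind these functions, splitting the data into jobs and handling each job in a separate process, can be sketched with Python's standard `multiprocessing.Pool`. Everything named here (`split_into_jobs`, `process_split`, `run_jobs`, the utterance IDs) is hypothetical. The functions listed above call Kaldi, whereas this stand-in worker only counts items.

```python
from multiprocessing import Pool
from typing import List, Sequence


def split_into_jobs(items: Sequence[str], num_jobs: int) -> List[List[str]]:
    """Distribute items round-robin into num_jobs lists."""
    return [list(items[i::num_jobs]) for i in range(num_jobs)]


def process_split(utterances: List[str]) -> int:
    """Stand-in for per-job work; here it only counts the utterances."""
    return len(utterances)


def run_jobs(utterances: Sequence[str], num_jobs: int) -> List[int]:
    """Run process_split on each split in its own worker process."""
    splits = split_into_jobs(utterances, num_jobs)
    with Pool(processes=num_jobs) as pool:
        return pool.map(process_split, splits)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on import.
    print(run_jobs(["utt1", "utt2", "utt3", "utt4", "utt5"], num_jobs=2))
```

Because each split is independent, the per-job results can be collected in order with `pool.map` and combined afterwards.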

Configuration API

These classes contain information about configuring data preparation and training.

MfccConfig(output_directory[, job, kwargs]) Class to store configuration information about MFCC generation
MonophoneConfig(**kwargs) Configuration class for monophone training
TriphoneConfig(**kwargs) Configuration class for triphone training
TriphoneFmllrConfig([align_often]) Configuration class for speaker-adapted triphone training
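Configuration classes that accept `**kwargs` usually follow a defaults-plus-overrides pattern. The sketch below shows that pattern in isolation. The class `SketchTrainingConfig`, its option names, and its default values are invented for illustration and do not reflect the actual parameters of the classes listed above.

```python
class SketchTrainingConfig:
    """Hypothetical config: defaults that keyword arguments may override."""

    # Made-up option names and values, for illustration only.
    defaults = {"num_iterations": 40, "max_gaussians": 1000}

    def __init__(self, **kwargs):
        # Reject misspelled or unsupported options instead of ignoring them.
        unknown = set(kwargs) - set(self.defaults)
        if unknown:
            raise TypeError(f"Unknown options: {sorted(unknown)}")
        # Start from the defaults, then apply any caller overrides.
        for name, value in {**self.defaults, **kwargs}.items():
            setattr(self, name, value)


config = SketchTrainingConfig(num_iterations=10)
```

Rejecting unknown keys is a design choice in this sketch: it surfaces typos in option names early rather than silently falling back to a default.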