API Reference

Aligner API

There are two main aligner classes, one for using a pretrained model and one for training a model while aligning.

PretrainedAligner(corpus, dictionary, ...[, ...])

Class for aligning a dataset using a pretrained acoustic model

TrainableAligner(corpus, dictionary, ...[, ...])

Aligner that aligns and trains acoustic models on a large dataset

Corpus API

The Corpus class contains information about how a dataset is structured.

Corpus(directory, output_directory[, ...])

Class that stores information about the dataset to align

Dictionary API

Dictionary(input_path, output_directory[, ...])

Class containing information about a pronunciation dictionary

Model API

Output from training a model is compressed using the Archive class, which produces a zip archive.

AcousticModel(source[, is_tmpdir])

G2PModel(source[, is_tmpdir])
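As a rough illustration of what packing model output into a zip archive involves, the sketch below uses only the Python standard library. It is not the Archive class itself: the helper names archive_model_directory and unpack_model_archive, and the file name final.mdl, are assumptions made for this example.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def archive_model_directory(model_dir: Path, archive_base: Path) -> Path:
    """Zip the contents of model_dir and return the path of the .zip file.

    Hypothetical helper for illustration; not part of the documented API.
    """
    # shutil.make_archive appends the ".zip" suffix to archive_base itself.
    zip_path = shutil.make_archive(str(archive_base), "zip", root_dir=str(model_dir))
    return Path(zip_path)


def unpack_model_archive(zip_path: Path, target_dir: Path) -> None:
    """Extract a zipped model into target_dir (hypothetical helper)."""
    shutil.unpack_archive(str(zip_path), str(target_dir), "zip")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        model_dir = tmp_path / "model"
        model_dir.mkdir()
        # Placeholder file standing in for real training output.
        (model_dir / "final.mdl").write_bytes(b"placeholder")
        zip_path = archive_model_directory(model_dir, tmp_path / "my_model")
        with zipfile.ZipFile(zip_path) as archive:
            print(archive.namelist())
```

Keeping the archive outside the directory being zipped, as done here, avoids the archive trying to include itself.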

Multiprocessing API

The multiprocessing module contains most of the interactions with Kaldi, since multiple processes are used to speed up the setup and alignment of the dataset.

mfcc(mfcc_directory, log_directory, ...)

Multiprocessing function that converts wav files into MFCCs

compile_train_graphs(directory, ...[, debug])

Multiprocessing function that compiles training graphs for utterances

mono_align_equal(mono_directory, ...)

Multiprocessing function that creates equal alignments for base monophone training

align(iteration, directory, split_directory, ...)

Multiprocessing function that aligns based on the current model

acc_stats(iteration, directory, ...[, fmllr])

Multiprocessing function that computes stats for GMM training

tree_stats(directory, align_directory, ...)

Multiprocessing function that computes stats for decision tree training

calc_fmllr(directory, split_directory, ...)

Multiprocessing function that computes speaker adaptation (fMLLR)

convert_alignments(directory, ...)

Multiprocessing function that converts alignments from previous training

convert_ali_to_textgrids(output_directory, ...)

Multiprocessing function that converts alignments to TextGrid files
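The general pattern behind the functions in this section, splitting utterances into jobs and handing each job to its own process, can be sketched with the standard library alone. The example below is a simplified illustration and not the module's actual code: the real functions call Kaldi binaries, whereas here each worker just computes a naive "equal alignment" that spreads an utterance's frames evenly over its phones. All names (split_into_jobs, equal_alignment, align_job, align_all) are hypothetical.

```python
from multiprocessing import Pool
from typing import Dict, List, Tuple

# (utterance id, number of frames, phone sequence); a made-up representation.
Utterance = Tuple[str, int, List[str]]


def split_into_jobs(items: list, num_jobs: int) -> List[list]:
    """Distribute items round-robin over num_jobs lists."""
    return [items[i::num_jobs] for i in range(num_jobs)]


def equal_alignment(num_frames: int, phones: List[str]) -> List[str]:
    """Label every frame with a phone, giving each phone a near-equal share."""
    if not phones:
        raise ValueError("phones must not be empty")
    if num_frames < len(phones):
        raise ValueError("need at least one frame per phone")
    # Frame i falls into the phone whose equal-width slot contains it.
    return [phones[i * len(phones) // num_frames] for i in range(num_frames)]


def align_job(job: List[Utterance]) -> Dict[str, List[str]]:
    """Worker: align every utterance assigned to one job."""
    return {utt_id: equal_alignment(n, phones) for utt_id, n, phones in job}


def align_all(utterances: List[Utterance], num_jobs: int) -> Dict[str, List[str]]:
    """Run one worker process per job and merge the per-job results."""
    jobs = split_into_jobs(utterances, num_jobs)
    with Pool(processes=num_jobs) as pool:
        results = pool.map(align_job, jobs)
    merged: Dict[str, List[str]] = {}
    for result in results:
        merged.update(result)
    return merged


if __name__ == "__main__":
    demo = [
        ("utt1", 10, ["k", "ae", "t"]),
        ("utt2", 6, ["d", "ao", "g"]),
        ("utt3", 4, ["ay"]),
    ]
    for utt_id, labels in sorted(align_all(demo, num_jobs=2).items()):
        print(utt_id, labels)
```

The worker is a top-level function so that it can be pickled and sent to the worker processes, and the Pool is created under the `__main__` guard so the script also runs on platforms that spawn rather than fork.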

Configuration API

These classes contain information about configuring data preparation and training.

MfccConfig(output_directory[, job, kwargs])

Class to store configuration information about MFCC generation

MonophoneConfig(**kwargs)

Configuration class for monophone training

TriphoneConfig(**kwargs)

Configuration class for triphone training

TriphoneFmllrConfig([align_often])

Configuration class for speaker-adapted triphone training
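Several of the configuration classes above take their options as keyword arguments. The following sketch shows one common way such a class can combine defaults with keyword overrides. It is a hypothetical stand-in, not one of the classes listed here, and the option names num_iterations and beam, along with their default values, are invented for illustration.

```python
class SketchTrainingConfig:
    """Hypothetical configuration class: defaults plus keyword overrides."""

    # Illustrative option names and values; not taken from the library.
    DEFAULTS = {"num_iterations": 40, "beam": 10}

    def __init__(self, **kwargs):
        # Reject misspelled or unsupported options early.
        unknown = set(kwargs) - set(self.DEFAULTS)
        if unknown:
            raise TypeError(f"Unknown options: {sorted(unknown)}")
        # Use the caller's value when given, otherwise the default.
        for name, default in self.DEFAULTS.items():
            setattr(self, name, kwargs.get(name, default))

    def as_dict(self):
        """Return the effective settings as a plain dictionary."""
        return {name: getattr(self, name) for name in self.DEFAULTS}


if __name__ == "__main__":
    config = SketchTrainingConfig(num_iterations=20)
    print(config.as_dict())
```

Rejecting unknown keywords is a design choice in this sketch; a class that silently ignores them would hide typos in option names.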