API Reference
Aligner API
There are two main aligner classes, one for using a pretrained model and one for training a model while aligning.
| | Class for aligning a dataset using a pretrained acoustic model |
| | Aligner that aligns and trains acoustic models on a large dataset |
Corpus API
The Corpus class contains information about how a dataset is structured.
| | Class that stores information about the dataset to align. |
Dictionary API
| | Class containing information about a pronunciation dictionary |
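As a rough illustration of the kind of information such a class might hold, the sketch below parses a plain-text pronunciation dictionary into a mapping from words to pronunciations. The file layout (one entry per line, a word followed by whitespace-separated phones) and the names `parse_dictionary`, `sample`, and `lexicon` are assumptions made for this example, not part of the API documented here.

```python
from collections import defaultdict


def parse_dictionary(text):
    """Map each word to a list of pronunciations (tuples of phones).

    Assumed layout: one entry per line, the word first, then its phones,
    all separated by whitespace. Repeated words add pronunciation variants.
    """
    entries = defaultdict(list)
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank lines and words listed without phones
        word, phones = parts[0], tuple(parts[1:])
        if phones not in entries[word]:
            entries[word].append(phones)
    return dict(entries)


# Hypothetical sample entries, used only to exercise the parser
sample = """\
the DH AH0
the DH IY0
cat K AE1 T
"""
lexicon = parse_dictionary(sample)
```

Storing a list of pronunciations per word keeps variants such as the two entries for "the" side by side instead of overwriting one with the other.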
Model API
Output from training a model is compressed using the Archive class, which produces a zip archive.
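The general idea of packing a directory of training output into a zip archive can be sketched with the Python standard library alone. This is not the Archive class itself: `archive_model`, the `my_model` stem, and the placeholder file name are assumptions made for illustration.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def archive_model(model_dir, output_stem):
    """Zip the contents of model_dir and return the path of the .zip file."""
    # shutil.make_archive appends ".zip" to the stem it is given
    archive = shutil.make_archive(str(output_stem), "zip", root_dir=str(model_dir))
    return Path(archive)


with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "model"
    model_dir.mkdir()
    # Placeholder file standing in for real training output
    (model_dir / "final.mdl").write_text("placeholder")
    archive_path = archive_model(model_dir, Path(tmp) / "my_model")
    with zipfile.ZipFile(archive_path) as zf:
        archived_names = zf.namelist()
```

Writing the archive next to the model directory rather than inside it avoids the archive trying to include itself.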
Multiprocessing API
The multiprocessing module contains most of the interactions with Kaldi, because multiple processes are used to speed up the setup and alignment of the dataset.
| | Multiprocessing function that converts wav files into MFCCs |
| | Multiprocessing function that compiles training graphs for utterances |
| | Multiprocessing function that creates equal alignments for base monophone training |
| | Multiprocessing function that aligns based on the current model |
| | Multiprocessing function that computes stats for GMM training |
| | Multiprocessing function that computes stats for decision tree training |
| | Multiprocessing function that computes speaker adaptation (fMLLR) |
| | Multiprocessing function that converts alignments from previous training |
| | Multiprocessing function that aligns based on the current model |
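The pattern these functions share, splitting the dataset into jobs and handling each job in its own process, can be sketched generically as follows. The names `split_into_jobs`, `process_job`, and `run_jobs` are hypothetical, and the worker only upper-cases utterance names where a real one would call an external tool such as a Kaldi binary.

```python
from concurrent.futures import ProcessPoolExecutor


def split_into_jobs(items, num_jobs):
    """Distribute items round-robin into num_jobs lists."""
    jobs = [[] for _ in range(num_jobs)]
    for index, item in enumerate(items):
        jobs[index % num_jobs].append(item)
    return jobs


def process_job(utterances):
    """Placeholder worker; a real one would invoke an external tool here."""
    return [name.upper() for name in utterances]


def run_jobs(utterances, num_jobs):
    """Run process_job on each chunk in a separate worker process."""
    jobs = split_into_jobs(utterances, num_jobs)
    with ProcessPoolExecutor(max_workers=num_jobs) as pool:
        # pool.map returns results in the same order as the submitted jobs
        return list(pool.map(process_job, jobs))
```

`run_jobs` is best called from a script guarded by `if __name__ == "__main__":`, since some platforms start worker processes by re-importing the main module.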
Configuration API
These classes contain information about configuring data preparation and training.
| | Class to store configuration information about MFCC generation |
| | Configuration class for monophone training |
| | Configuration class for triphone training |
| | Configuration class for speaker-adapted triphone training |
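A configuration class of this kind can be pictured as a small container of options that knows how to write itself out. The sketch below is hypothetical: `FeatureConfigSketch`, its fields, and their values are illustrative only, and rendering options as `--name=value` lines (the style of Kaldi option files) is an assumption about how such settings might be passed on.

```python
from dataclasses import dataclass, fields


@dataclass
class FeatureConfigSketch:
    """Hypothetical container for feature-extraction options."""

    frame_shift: int = 10    # milliseconds between frames (illustrative value)
    frame_length: int = 25   # milliseconds per frame (illustrative value)
    use_energy: bool = False

    def to_option_lines(self):
        """Render each field as a `--name=value` line, one per option."""
        lines = []
        for f in fields(self):
            value = getattr(self, f.name)
            if isinstance(value, bool):
                value = str(value).lower()  # write booleans as true/false
            lines.append(f"--{f.name.replace('_', '-')}={value}")
        return lines


config = FeatureConfigSketch(frame_shift=10, use_energy=False)
option_lines = config.to_option_lines()
```

Keeping the options in one object means data preparation and training code can share the same settings without passing long argument lists.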