IvectorExtractorTrainer

class aligner.trainers.IvectorExtractorTrainer(default_feature_config)

Configuration class for i-vector extractor training

Attributes:
ivector_dim : int

Dimension of the extracted i-vector

ivector_period : int

Number of frames between i-vector extractions

num_iters : int

Number of training iterations to perform

num_gselect : int

Number of Gaussians to select per frame during Gaussian selection with the diagonal model

posterior_scale : float

Scale on the acoustic posteriors, intended to account for inter-frame correlations

min_post : float

Minimum posterior to use (posteriors below this are pruned out)

subsample : int

Subsampling factor used to speed up training: only every x-th feature frame is used

max_count : int

Using this option (e.g. --max-count 100) can make i-vectors more consistent across utterances of different lengths by scaling up the prior term when the data count exceeds this value. The data count is measured after posterior scaling, so with a posterior scale of 0.1, --max-count 100 starts to take effect after 1000 frames, or 10 seconds of data at 100 frames per second.
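
The arithmetic behind the max_count description can be checked with a short sketch. It assumes each frame adds roughly posterior_scale to the data count (posteriors summing to one per frame before scaling, ignoring any mass removed by min_post pruning) and a 10 ms frame shift. Neither helper function is part of this class; both names are hypothetical and exist only for illustration.

```python
def frames_until_max_count(max_count: float, posterior_scale: float) -> float:
    """Frames needed before the scaled data count reaches max_count.

    Hypothetical helper: assumes each frame contributes about
    posterior_scale to the data count.
    """
    if posterior_scale <= 0:
        raise ValueError("posterior_scale must be positive")
    return max_count / posterior_scale


def frames_to_seconds(num_frames: float, frame_shift_s: float = 0.01) -> float:
    """Convert a frame count to seconds, assuming a 10 ms frame shift."""
    return num_frames * frame_shift_s


# Values taken from the max_count description: max_count 100, posterior scale 0.1
frames = frames_until_max_count(max_count=100, posterior_scale=0.1)
seconds = frames_to_seconds(frames)
print(f"max_count takes effect after about {frames:.0f} frames ({seconds:.0f} s)")
```

With a larger posterior_scale the threshold is reached sooner: under the same assumptions, a scale of 1.0 would make max_count 100 apply after only 100 frames.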

Attributes

align_directory
align_log_directory
feature_file_base_name
final_gaussian_iteration
gaussian_increment
log_directory
meta
phone_type
train_directory
train_type

Methods

align(subset[, call_back])
compute_calculated_properties()
export_textgrids() Export a TextGrid file for every sound file in the dataset
get_unaligned_utterances()
init_training(identifier, …)
parse_log_directory(directory, iteration, …) Parse error files and relate relevant information about unaligned files
save(path) Output an acoustic model and dictionary to the specified path
train([call_back])
update(data)
save(path)

Output an acoustic model and dictionary to the specified path

Parameters:
path : str

Path to save acoustic model and dictionary