NnetTrainer

class aligner.trainers.NnetTrainer(default_feature_config)

Configuration class for neural network training

Attributes:
num_epochs : int

Number of training epochs; the total number of iterations is derived from this value

iters_per_epoch : int

Number of iterations per epoch

realign_times : int

Number of times to realign during training; the realignments are spaced equally over the iterations

beam : int

Default beam width for alignment

retry_beam : int

Beam width to fall back on if no alignment is produced

initial_learning_rate : float

The initial learning rate at the beginning of training

final_learning_rate : float

The final learning rate by the end of training

pnorm_input_dim : int

The input dimension of the pnorm component

pnorm_output_dim : int

The output dimension of the pnorm component

p : int

Pnorm parameter

hidden_layer_dim : int

Dimension of a hidden layer

samples_per_iter : int

Number of samples seen per job on each iteration; used when getting examples

shuffle_buffer_size : int

Controls how thoroughly the samples are randomized on each iteration. Setting it to 0 or to a large value gives complete randomization, but this consumes more memory and causes spikes in disk I/O. Smaller values are easier on disk and memory but less random. The choice is not critical, because the samples are randomized right at the start anyway. The point of the buffer is to place data in different minibatches on different iterations, since in the preconditioning method two samples in the same minibatch can affect each other's gradients.

add_layers_period : int

Number of iterations between addition of a new layer

num_hidden_layers : int

Number of hidden layers

randprune : float

Random-pruning parameter; speeds up LDA

alpha : float

Relates to preconditioning

mix_up : int

Number of components to mix up to

prior_subset_size : int

Number of samples per job for computing priors

update_period : int

How often the preconditioning subspace is updated

num_samples_history : int

Relates to online preconditioning

preconditioning_rank_in : int

Relates to online preconditioning

preconditioning_rank_out : int

Relates to online preconditioning
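
The way num_epochs, iters_per_epoch, realign_times and the two learning rates fit together can be sketched as follows. This is a minimal illustration, not the trainer's actual code. It assumes that the total iteration count is num_epochs * iters_per_epoch, that realignments fall on evenly spaced interior iterations, and that the learning rate decays geometrically from initial_learning_rate to final_learning_rate. The helper names and the numeric values are hypothetical.

```python
def total_iterations(num_epochs, iters_per_epoch):
    """Assumed relationship: every epoch contributes iters_per_epoch iterations."""
    return num_epochs * iters_per_epoch


def realign_iterations(num_iters, realign_times):
    """Return realign_times iteration indices spaced evenly inside (0, num_iters).

    Assumption: the endpoints are excluded, so the first and last
    iterations are never realignment points.
    """
    step = num_iters / (realign_times + 1)
    return [round(step * k) for k in range(1, realign_times + 1)]


def learning_rate(iteration, num_iters, initial, final):
    """Geometric interpolation from `initial` (iteration 0) to `final` (last iteration).

    Assumption: the decay is exponential; the real schedule may differ.
    """
    if num_iters <= 1:
        return initial
    fraction = iteration / (num_iters - 1)
    return initial * (final / initial) ** fraction


if __name__ == "__main__":
    # Illustrative values only; these are not the class defaults.
    num_iters = total_iterations(num_epochs=4, iters_per_epoch=5)
    print("iterations:", num_iters)
    print("realign at:", realign_iterations(num_iters, realign_times=3))
    for i in (0, num_iters // 2, num_iters - 1):
        print(i, learning_rate(i, num_iters, initial=0.32, final=0.032))
```

Keeping the three calculations in separate functions makes it easy to swap in a different spacing or decay rule if the real schedule turns out to differ from these assumptions.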

Attributes

align_directory
align_log_directory
egs_directory
feature_file_base_name
final_gaussian_iteration
gaussian_increment
log_directory
meta
phone_type
train_directory
train_type

Methods

align(subset[, call_back])
compute_calculated_properties()
export_textgrids() Export a TextGrid file for every sound file in the dataset
get_unaligned_utterances()
init_training(identifier, …)
parse_log_directory(directory, iteration, …) Parse error files and relate relevant information about unaligned files
save(path) Output an acoustic model and dictionary to the specified path
train([call_back])
update(data)

save(path)

Output an acoustic model and dictionary to the specified path

Parameters:
path : str

Path to save acoustic model and dictionary