Training an ivector extractor

The Montreal Forced Aligner can train ivector extractors using an acoustic model for generating alignments. As part of this training process, a classifier is built that can be used as part of Speaker classification.

Steps to train an ivector extractor:

  1. Make sure the steps in Installation have been completed and that you are in the same Conda/virtual environment that MFA was installed in.
  2. Run the following command, substituting the arguments with your own paths:
     mfa train_ivector corpus_directory dictionary_path acoustic_model_path output_model_path
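
As an illustration of step 2, the sketch below fills in the four positional arguments and adds two of the options described under "Options available". All of the paths and file names are hypothetical placeholders, and the value 8 for --num_jobs is an arbitrary choice. The script only prints the assembled command so that it can be checked before it is run.

```shell
# Hypothetical paths; substitute your own.
corpus_directory="/data/my_corpus"
dictionary_path="/data/my_dictionary.txt"
acoustic_model_path="/data/my_acoustic_model.zip"
output_model_path="/data/my_ivector_extractor.zip"

# Assemble the command as a string so it can be inspected first.
# This simple form assumes that none of the paths contain spaces.
cmd="mfa train_ivector $corpus_directory $dictionary_path $acoustic_model_path $output_model_path --num_jobs 8 --clean"

# Dry run: print the command instead of executing it.
echo "$cmd"
```

Once the printed command looks right, it can be pasted into the terminal of the environment in which MFA was installed.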

Options available:

-h
--help

Display help message for the command

--config_path PATH

Path to a YAML config file that will specify the training configuration. See Ivector Configuration for more details.

-s NUMBER
--speaker_characters NUMBER

Number of characters to use to identify speakers; if not specified, the aligner assumes that the directory name is the identifier for the speaker. Additionally, it accepts the value prosodylab to use the second field of a _ delimited file name, following the convention of labelling production data in the ProsodyLab at McGill.

-t DIRECTORY
--temp_directory DIRECTORY

Temporary directory root to use for aligning; the default is ~/Documents/MFA

-j NUMBER
--num_jobs NUMBER

Number of jobs to use; defaults to 3. Set this higher if you have more processors available and would like processing to run faster.

-v
--verbose

If present, the aligner will print out more information

-d
--debug

The aligner will run in debug mode

-c
--clean

Forces removal of temporary files in ~/Documents/MFA
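
To make the two speaker-identification modes of --speaker_characters more concrete, the sketch below mimics them with standard shell tools. It is not MFA's own code, only an approximation of the rules as described above, and both file names are made up for the example.

```shell
# Hypothetical file name in which the first 4 characters identify the speaker.
file_name="SPK1_utt01.wav"

# Roughly what "-s 4" is described as doing: keep the first 4 characters.
speaker_by_prefix=$(printf '%s' "$file_name" | cut -c1-4)

# Hypothetical file name following the ProsodyLab convention,
# where the speaker is the second field of a _ delimited name.
prosodylab_name="exp1_subj07_item3.wav"

# Roughly what "-s prosodylab" is described as doing: keep the second field.
speaker_by_field=$(printf '%s' "$prosodylab_name" | cut -d_ -f2)

echo "prefix rule:     $speaker_by_prefix"
echo "prosodylab rule: $speaker_by_field"
```

For a corpus named like the first file, the option would be passed as -s 4; for one named like the second, as -s prosodylab. If neither is given, the directory name is used as the speaker identifier.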