Adapt acoustic model to new data (mfa adapt)#

New in MFA 2.0 is the ability to adapt a pretrained acoustic model to a new dataset. MFA first aligns the dataset using the pretrained model, and then updates the acoustic model’s GMM means with means estimated from the new data. See train_map.sh for the Kaldi script this functionality corresponds to. As part of the adaptation process, MFA can also generate final alignments and export them if an output directory is specified in the command.
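
The mean update described above can be illustrated with a textbook MAP (maximum a posteriori) estimate for a single Gaussian. This is only a sketch and not the code that MFA or Kaldi runs: the function name `map_update_mean`, the prior weight `tau`, and its value of 20.0 are assumptions chosen for illustration.

```python
def map_update_mean(prior_mean, frames, posteriors, tau=20.0):
    """Return a MAP-style adapted mean for one Gaussian component.

    prior_mean : mean vector from the pretrained model
    frames     : feature vectors aligned to this Gaussian in the new data
    posteriors : occupation weight of each frame for this Gaussian
    tau        : hypothetical prior weight; larger values trust the
                 pretrained mean more (illustrative default)
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)

    # Accumulate the posterior-weighted sum of the new data.
    weighted_sum = [0.0] * dim
    for frame, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]

    # Interpolate between the pretrained mean and the data statistics.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]
```

With a large `tau` or very few aligned frames, the adapted mean stays close to the pretrained mean; with more data, it moves toward the mean of the new data.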

Command reference#

mfa adapt#

Adapt an acoustic model to a new corpus.

mfa adapt [OPTIONS] CORPUS_DIRECTORY DICTIONARY_PATH ACOUSTIC_MODEL_PATH
          OUTPUT_MODEL_PATH

Options

--output_directory <output_directory>#

Path to save alignments.

-c, --config_path <config_path>#

Path to config file to use for training.

-s, --speaker_characters <speaker_characters>#

Number of characters at the start of each file name to use for determining the speaker. By default, directory names are used.
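
As a rough illustration of the two ways a speaker can be determined, the sketch below derives a speaker label either from the leading characters of the file name or from the containing directory. The helper `speaker_for_file` is hypothetical and is not part of MFA's API; it assumes the characters are taken from the start of the file name.

```python
from pathlib import PurePosixPath


def speaker_for_file(path, speaker_characters=0):
    """Return a speaker label for a sound file path (illustrative only)."""
    p = PurePosixPath(path)
    if speaker_characters > 0:
        # Use the first N characters of the file name as the speaker.
        return p.name[:speaker_characters]
    # Otherwise fall back to the name of the containing directory.
    return p.parent.name
```

For a hypothetical file `corpus/spk1/abc_001.wav`, this sketch gives `spk1` by default and `abc` when `speaker_characters` is 3.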

-a, --audio_directory <audio_directory>#

Audio directory root to use for finding audio files.

--output_format <output_format>#

Format for aligned output files (default is long_textgrid).

Options:

long_textgrid | short_textgrid | json | csv

--include_original_text#

Flag to include original utterance text in the output.

-p, --profile <profile>#

Configuration profile to use, defaults to “global”

-t, --temporary_directory <temporary_directory>#

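Set the default temporary directory. The default is the MFA folder in the user’s Documents directory (shown as /home/docs/Documents/MFA in the environment where this page was built).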

-j, --num_jobs <num_jobs>#

Set the number of processes to use by default, defaults to 3

--clean, --no_clean#

Remove files from previous runs, default is False

-v, --verbose, -nv, --no_verbose#

Output debug messages, default is False

-q, --quiet, -nq, --no_quiet#

Suppress all output messages (overrides verbose), default is False

--overwrite, --no_overwrite#

Overwrite output files when they exist, default is False

--use_mp, --no_use_mp#

Turn on/off multiprocessing. Multiprocessing is recommended, as it allows for faster execution.

--use_threading, --no_use_threading#

Use the threading library rather than the multiprocessing library. Multiprocessing is recommended, as it allows for faster execution.

-d, --debug, -nd, --no_debug#

Run extra steps for debugging issues, default is False

--use_postgres, --no_use_postgres#

Use postgres instead of sqlite for extra functionality, default is False

--single_speaker#

Single speaker mode creates multiprocessing splits based on utterances rather than speakers. This mode also disables speaker adaptation, equivalent to --uses_speaker_adaptation false.

--textgrid_cleanup, --cleanup_textgrids, --no_textgrid_cleanup, --no_cleanup_textgrids#

Turn on/off post-processing of TextGrids that cleans up silences and recombines compound words and clitics.

-h, --help#

Show this message and exit.

Arguments

CORPUS_DIRECTORY#

Required argument

DICTIONARY_PATH#

Required argument

ACOUSTIC_MODEL_PATH#

Required argument

OUTPUT_MODEL_PATH#

Required argument
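
To show how the options and arguments above fit together, the following sketch assembles an `mfa adapt` command line from Python. The helper `build_adapt_command` and all paths are hypothetical placeholders; only flags documented on this page are used, and the command itself is printed rather than executed.

```python
import shlex


def build_adapt_command(corpus_directory, dictionary_path, acoustic_model_path,
                        output_model_path, output_directory=None,
                        num_jobs=None, clean=False):
    """Build the argument list for `mfa adapt` (illustrative helper)."""
    cmd = ["mfa", "adapt"]
    # Options come first, followed by the four required positional arguments.
    if output_directory is not None:
        cmd += ["--output_directory", str(output_directory)]
    if num_jobs is not None:
        cmd += ["--num_jobs", str(num_jobs)]
    if clean:
        cmd.append("--clean")
    cmd += [str(corpus_directory), str(dictionary_path),
            str(acoustic_model_path), str(output_model_path)]
    return cmd


# Placeholder paths; replace them with real locations before running.
example_cmd = build_adapt_command(
    "my_corpus", "my_dictionary.dict", "pretrained_model.zip",
    "adapted_model.zip", output_directory="aligned_output",
    num_jobs=4, clean=True,
)
print(shlex.join(example_cmd))

# To run the command (requires MFA to be installed):
#   import subprocess
#   subprocess.run(example_cmd, check=True)
```

Passing `output_directory` corresponds to the case described at the top of this page, where the final alignments are exported in addition to the adapted model.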

Configuration reference#

API reference#