Align with an acoustic model (mfa align)#
This is the primary workflow of MFA: aligning your dataset with a pretrained acoustic model. A number of pretrained MFA acoustic models are available, but you can also adapt a pretrained model to your data (see Adapt acoustic model to new data (mfa adapt)) or train an acoustic model from scratch on your dataset (see Train a new acoustic model (mfa train)).
See also
Evaluating alignments for details on how to evaluate alignments against a gold standard.
Fine-tuning alignments for implementation details on how alignments are fine-tuned.
Phone model alignments for implementation details on using phone bigram models for generating alignments.
Command reference#
mfa align#
Align a corpus with a pronunciation dictionary and a pretrained acoustic model.
mfa align [OPTIONS] CORPUS_DIRECTORY DICTIONARY_PATH ACOUSTIC_MODEL_PATH OUTPUT_DIRECTORY
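As a rough illustration of the synopsis above, the four positional arguments can be assembled into a command from Python. This is a minimal sketch, not part of MFA itself: the helper names `build_align_command` and `run_align` and the example paths are hypothetical, and actually running the command assumes MFA is installed and available on your PATH.

```python
import subprocess


def build_align_command(corpus_directory, dictionary_path,
                        acoustic_model_path, output_directory):
    """Return the argument list for a basic `mfa align` call.

    Hypothetical helper for illustration. The order follows the synopsis:
    CORPUS_DIRECTORY DICTIONARY_PATH ACOUSTIC_MODEL_PATH OUTPUT_DIRECTORY.
    """
    return [
        "mfa", "align",
        str(corpus_directory),
        str(dictionary_path),
        str(acoustic_model_path),
        str(output_directory),
    ]


def run_align(corpus_directory, dictionary_path,
              acoustic_model_path, output_directory):
    """Run the alignment; raises CalledProcessError on a non-zero exit."""
    command = build_align_command(
        corpus_directory, dictionary_path,
        acoustic_model_path, output_directory,
    )
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder paths; replace them with your own corpus, dictionary,
    # acoustic model, and output locations.
    command = build_align_command(
        "my_corpus", "my_dictionary.dict", "my_acoustic_model.zip", "aligned_output"
    )
    print(" ".join(command))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when a path contains spaces.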
Options
- -c, --config_path <config_path>#
Path to config file to use for alignment.
- -s, --speaker_characters <speaker_characters>#
Number of characters of file names to use for determining the speaker; the default is to use directory names.
- -a, --audio_directory <audio_directory>#
Audio directory root to use for finding audio files.
- --reference_directory <reference_directory>#
Directory containing gold standard alignments to evaluate against.
- --custom_mapping_path <custom_mapping_path>#
YAML file for mapping phones across phone sets in evaluations.
- --output_format <output_format>#
Format for aligned output files (default is long_textgrid).
- Options:
long_textgrid | short_textgrid | json | csv
- --include_original_text#
Flag to include original utterance text in the output.
- --fine_tune#
Flag for running extra fine tuning stage.
- -p, --profile <profile>#
Configuration profile to use, defaults to “global”
- -t, --temporary_directory <temporary_directory>#
Set the default temporary directory, default is /home/docs/Documents/MFA
- -j, --num_jobs <num_jobs>#
Set the default number of processes to use, defaults to 3
- --clean, --no_clean#
Remove files from previous runs, default is False
- -v, --verbose, -nv, --no_verbose#
Output debug messages, default is False
- -q, --quiet, -nq, --no_quiet#
Suppress all output messages (overrides verbose), default is False
- --overwrite, --no_overwrite#
Overwrite output files when they exist, default is False
- --use_mp, --no_use_mp#
Turn on/off multiprocessing. Multiprocessing is recommended and will allow for faster execution.
- -d, --debug, -nd, --no_debug#
Run extra steps for debugging issues, default is False
- --single_speaker#
Single speaker mode creates multiprocessing splits based on utterances rather than speakers.
- --textgrid_cleanup, --no_textgrid_cleanup#
Turn on/off post-processing of TextGrids that cleans up silences and recombines compound words and clitics.
- -h, --help#
Show this message and exit.
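To show how a few of the options above combine, the sketch below builds the option portion of an `mfa align` command in Python. It is an illustration under stated assumptions rather than MFA's own API: the helper name `build_align_options` and the example paths are hypothetical, only flags listed in this reference are used, and the set of output formats is copied from the `--output_format` entry.

```python
# Output formats as listed under --output_format in this reference.
OUTPUT_FORMATS = ("long_textgrid", "short_textgrid", "json", "csv")


def build_align_options(output_format=None, num_jobs=None, clean=False,
                        fine_tune=False, single_speaker=False):
    """Return a list of `mfa align` option tokens (hypothetical helper).

    Options left at their default values here are omitted, so MFA's own
    defaults apply.
    """
    options = []
    if output_format is not None:
        if output_format not in OUTPUT_FORMATS:
            raise ValueError(
                f"output_format must be one of {OUTPUT_FORMATS}, "
                f"got {output_format!r}"
            )
        options += ["--output_format", output_format]
    if num_jobs is not None:
        if num_jobs < 1:
            raise ValueError("num_jobs must be a positive integer")
        options += ["--num_jobs", str(num_jobs)]
    if clean:
        options.append("--clean")
    if fine_tune:
        options.append("--fine_tune")
    if single_speaker:
        options.append("--single_speaker")
    return options


if __name__ == "__main__":
    # Placeholder paths; options come before the positional arguments,
    # matching the synopsis.
    command = [
        "mfa", "align",
        *build_align_options(output_format="json", num_jobs=4, clean=True),
        "my_corpus", "my_dictionary.dict", "my_acoustic_model.zip", "aligned_output",
    ]
    print(" ".join(command))
```

Validating `--output_format` before launching catches a mistyped format early instead of after MFA has started loading the corpus.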
Arguments
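- CORPUS_DIRECTORY#
Required argument. Directory containing the corpus to align.
- DICTIONARY_PATH#
Required argument. Path to the pronunciation dictionary.
- ACOUSTIC_MODEL_PATH#
Required argument. Path to the pretrained acoustic model.
- OUTPUT_DIRECTORY#
Required argument. Directory where the aligned output files are written.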