Align with an acoustic model (`mfa align`)

This is the primary workflow of MFA: using a pretrained acoustic model to align your dataset. A number of pretrained MFA acoustic models are available, but you can also adapt a pretrained model to your data (see Adapt acoustic model to new data (mfa adapt)) or train an acoustic model from scratch on your dataset (see Train a new acoustic model (mfa train)).

See also

Evaluating alignments for details on how to evaluate alignments against a gold standard.
Fine-tuning alignments for implementation details on how alignments are fine-tuned.
Phone model alignments for implementation details on using phone bigram models to generate alignments.
Analyzing alignment quality for details on the fields generated in the alignment_analysis.csv file in the output folder.

Command reference

mfa align

Align a corpus with a pronunciation dictionary and a pretrained acoustic model.

mfa align [OPTIONS] CORPUS_DIRECTORY DICTIONARY_PATH ACOUSTIC_MODEL_PATH
          OUTPUT_DIRECTORY
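As a minimal sketch of how the four positional arguments fit together, the Python helper below assembles the argument list for `mfa align` without running it. The helper name `build_align_command` and all paths are hypothetical placeholders; only the argument order and the option names come from this page.

```python
from pathlib import Path


def build_align_command(corpus_directory, dictionary_path, acoustic_model_path,
                        output_directory, extra_options=()):
    """Return the argv list for `mfa align` (hypothetical helper, not part of MFA)."""
    return [
        "mfa", "align",
        *extra_options,                # options go before the positional arguments
        str(corpus_directory),         # CORPUS_DIRECTORY
        str(dictionary_path),          # DICTIONARY_PATH
        str(acoustic_model_path),      # ACOUSTIC_MODEL_PATH
        str(output_directory),         # OUTPUT_DIRECTORY
    ]


command = build_align_command(
    Path("my_corpus"),           # placeholder corpus directory
    "my_dictionary.dict",        # placeholder dictionary path
    "my_acoustic_model.zip",     # placeholder acoustic model path
    Path("aligned_output"),      # placeholder output directory
    extra_options=["--clean", "--num_jobs", "4"],
)
print(" ".join(command))
```

With MFA installed, the resulting list could be passed to `subprocess.run(command, check=True)`; building the list first keeps paths containing spaces intact.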

Options

-c, --config_path <config_path>: Path to config file to use for aligning.

-s, --speaker_characters <speaker_characters>: Number of characters of file names to use for determining speaker; the default is to use directory names.
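To illustrate what a prefix-based speaker rule means in practice, the sketch below groups files by the first few characters of their names. It shows the idea only and is not MFA's internal implementation; the function name `speaker_from_file_name` and the file names are made up for the example.

```python
from collections import defaultdict
from pathlib import Path


def speaker_from_file_name(file_name, speaker_characters):
    """Use the first `speaker_characters` characters of the file name as the speaker."""
    return Path(file_name).stem[:speaker_characters]


# Hypothetical corpus where the speaker ID is the first five characters.
files = ["spk01_utt1.wav", "spk01_utt2.wav", "spk02_utt1.wav"]

by_speaker = defaultdict(list)
for name in files:
    by_speaker[speaker_from_file_name(name, 5)].append(name)

print(dict(by_speaker))
```

With a prefix length of 5, the three files fall under two speakers, `spk01` and `spk02`. This only works when every speaker ID has the same length at the start of the file name.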

-a, --audio_directory <audio_directory>: Audio directory root to use for finding audio files.

--reference_directory <reference_directory>: Directory containing gold standard alignments to evaluate against.

--custom_mapping_path <custom_mapping_path>: YAML file for mapping phones across phone sets in evaluations.

--output_format <output_format>: Format for aligned output files (default is long_textgrid). Options: long_textgrid | short_textgrid | json | csv
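When the command is assembled from a script, it can help to check the format name before launching a long alignment run. The sketch below validates a value against the four choices listed above; `output_format_option` is a hypothetical helper, not part of MFA.

```python
# The four values documented for --output_format.
OUTPUT_FORMATS = ("long_textgrid", "short_textgrid", "json", "csv")


def output_format_option(output_format="long_textgrid"):
    """Return the CLI arguments for a valid format, or raise ValueError."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(
            f"Unknown output format {output_format!r}; "
            f"expected one of: {', '.join(OUTPUT_FORMATS)}"
        )
    return ["--output_format", output_format]


print(output_format_option("json"))
```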

--include_original_text: Flag to include original utterance text in the output.

--fine_tune: Flag for running extra fine tuning stage.

-p, --profile <profile>: Configuration profile to use, defaults to “global”.

-t, --temporary_directory <temporary_directory>: Set the default temporary directory, default is /home/docs/Documents/MFA

-j, --num_jobs <num_jobs>: Set the number of processes to use by default, defaults to 3.

--clean, --no_clean: Remove files from previous runs, default is False.

-v, --verbose, -nv, --no_verbose: Output debug messages, default is False.

-q, --quiet, -nq, --no_quiet: Suppress all output messages (overrides verbose), default is False.

--overwrite, --no_overwrite: Overwrite output files when they exist, default is False.
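Several options on this page come in on/off pairs of the form `--name` / `--no_name`. A small sketch of turning Python booleans into those flags follows; `boolean_flag` is a hypothetical helper for illustration.

```python
def boolean_flag(name, enabled):
    """Render a paired option as `--name` when enabled, `--no_name` otherwise."""
    return f"--{name}" if enabled else f"--no_{name}"


# Example: clean up previous runs, but never overwrite existing output files.
flags = [boolean_flag("clean", True), boolean_flag("overwrite", False)]
print(flags)
```

Passing the negative form explicitly makes a script's behavior independent of whatever defaults are stored in the active configuration profile.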

--use_mp, --no_use_mp: Turn on/off multiprocessing. Multiprocessing is recommended, as it allows for faster execution.

-d, --debug, -nd, --no_debug: Run extra steps for debugging issues, default is False.

--use_postgres, --no_use_postgres: Use postgres instead of sqlite for extra functionality, default is False.

--single_speaker: Single speaker mode creates multiprocessing splits based on utterances rather than speakers.
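The difference between the two splitting strategies can be sketched as follows. This is an illustration under stated assumptions, not MFA's scheduling code: the round-robin assignment and both function names are invented for the example.

```python
from collections import defaultdict


def split_by_speaker(utterances, num_jobs):
    """Keep each speaker's utterances together; assign speakers to jobs round-robin."""
    by_speaker = defaultdict(list)
    for speaker, utterance in utterances:
        by_speaker[speaker].append(utterance)
    jobs = [[] for _ in range(num_jobs)]
    for index, speaker in enumerate(sorted(by_speaker)):
        jobs[index % num_jobs].extend(by_speaker[speaker])
    return jobs


def split_by_utterance(utterances, num_jobs):
    """Ignore speakers; assign individual utterances to jobs round-robin."""
    jobs = [[] for _ in range(num_jobs)]
    for index, (_speaker, utterance) in enumerate(utterances):
        jobs[index % num_jobs].append(utterance)
    return jobs


# A corpus with a single (hypothetical) speaker and six utterances.
utterances = [("alice", f"utt{i}") for i in range(6)]

print(split_by_speaker(utterances, 3))    # all work lands in one job
print(split_by_utterance(utterances, 3))  # work is spread across three jobs
```

With only one speaker, a speaker-based split leaves two of the three jobs empty, which is the situation single speaker mode is meant to avoid.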

--textgrid_cleanup, --no_textgrid_cleanup: Turn on/off post-processing of TextGrids that cleans up silences and recombines compound words and clitics.

-h, --help: Show this message and exit.

Arguments

CORPUS_DIRECTORY: Required argument

DICTIONARY_PATH: Required argument

ACOUSTIC_MODEL_PATH: Required argument

OUTPUT_DIRECTORY: Required argument

Configuration reference

Global Options

API reference

Alignment