Configuration#
MFA root directory#
MFA uses a temporary directory when running commands, which can be specified with --temp_directory (see below). It also uses a root directory to store global configuration settings and saved models. By default this root directory is ~/Documents/MFA, but if you would like to put it somewhere else, you can set the environment variable MFA_ROOT_DIR to the location you want. MFA will raise an error on load if it is unable to write to the root directory.
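The lookup described above can be sketched in a few lines of Python. This is only an illustration of the behaviour and not MFA's actual implementation: the function names resolve_root_dir and ensure_writable are hypothetical, and the write check (creating a throwaway file) is an assumption about how such a test could be done.

```python
import os
import tempfile
from pathlib import Path


def resolve_root_dir(env=None):
    """Return the directory named by MFA_ROOT_DIR, or ~/Documents/MFA if it is unset.

    Hypothetical helper for illustration; not part of MFA.
    """
    env = os.environ if env is None else env
    override = env.get("MFA_ROOT_DIR")
    if override:
        return Path(override).expanduser()
    return Path.home() / "Documents" / "MFA"


def ensure_writable(root):
    """Create the directory if needed and raise OSError if it cannot be written to."""
    root.mkdir(parents=True, exist_ok=True)
    # Creating (and automatically deleting) a scratch file proves write access.
    with tempfile.NamedTemporaryFile(dir=root):
        pass
    return root


# Demonstrate with a throwaway directory so nothing is created in the home directory.
with tempfile.TemporaryDirectory() as scratch:
    root = ensure_writable(resolve_root_dir({"MFA_ROOT_DIR": scratch}))
    print(root)
```

Passing the environment as a dictionary keeps the sketch easy to try out without changing the real MFA_ROOT_DIR of the current shell.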
Global configuration#
Global configuration for MFA can be updated via the mfa configure subcommand. Once the command is called with a flag, that value becomes the default for all future runs (though you can override most settings when you call other commands).
mfa configure#
The configure command is used to set global defaults for MFA so you don’t have to set them every time you call an MFA command.
mfa configure [OPTIONS]
Options
- -p, --profile <profile>#
Configuration profile to use, defaults to “global”
- -t, --temporary_directory <temporary_directory>#
Set the default temporary directory. Currently defaults to /home/docs/Documents/MFA
- -j, --num_jobs <num_jobs>#
Set the number of processes to use by default. Currently defaults to 3.
- --always_clean, --never_clean#
Turn on/off clean mode where MFA will clean temporary files before each run. Currently defaults to False.
- --always_verbose, --never_verbose#
Turn on/off verbose mode where MFA will print more output. Currently defaults to False.
- --always_quiet, --never_quiet#
Turn on/off quiet mode where MFA will not print any output. Currently defaults to False.
- --always_debug, --never_debug#
Turn on/off extra debugging functionality. Currently defaults to False.
- --always_overwrite, --never_overwrite#
Turn on/off overwriting export files. Currently defaults to False.
- --enable_mp, --disable_mp#
Turn on/off multiprocessing. Multiprocessing is recommended, as it allows for faster execution. Currently defaults to True.
- --enable_textgrid_cleanup, --disable_textgrid_cleanup#
Turn on/off post-processing of TextGrids that cleans up silences and recombines compound words and clitics. Currently defaults to True.
- --enable_auto_server, --disable_auto_server#
If auto_server is enabled, MFA will start a server at the beginning of a command and close it at the end. If turned off, use the mfa server commands to initialize, start, and stop a profile’s server. Currently defaults to True.
- --enable_use_postgres, --disable_use_postgres#
If use_postgres is enabled, MFA will use PostgreSQL as the database backend instead of SQLite. Currently defaults to False.
- --blas_num_threads <blas_num_threads>#
Number of threads to use for BLAS libraries, 1 is recommended due to how much MFA relies on multiprocessing. Currently defaults to 1.
- --github_token <github_token>#
GitHub token to use for model downloading.
- --bytes_limit <bytes_limit>#
Bytes limit for Joblib Memory caching on disk.
- --seed <seed>#
Random seed to set for various pseudorandom processes.
- -h, --help#
Show this message and exit.
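For scripted setups, the options above can be assembled into a single mfa configure call. The sketch below only builds the argument list from flags documented in this section; the helper name configure_command is hypothetical, and the call that would actually run MFA is left commented out because it assumes MFA is installed and on the PATH.

```python
import subprocess


def configure_command(num_jobs, always_clean, profile="global"):
    """Build the argument list for an `mfa configure` call (hypothetical helper)."""
    args = ["mfa", "configure", "--profile", profile, "--num_jobs", str(num_jobs)]
    # --always_clean and --never_clean are the two sides of one on/off switch.
    args.append("--always_clean" if always_clean else "--never_clean")
    return args


command = configure_command(num_jobs=4, always_clean=True)
print(" ".join(command))

# Uncomment to apply the defaults; this requires MFA to be installed and on PATH.
# subprocess.run(command, check=True)
```

Building the command as a list, rather than a single string, avoids shell quoting problems if a value such as a directory path contains spaces.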
Configuring specific commands#
MFA lets you customize various parameters that control aspects of data processing and workflows. These can be supplied on the command line, for example:
mfa align ... --beam 1000
The above command will set the beam width used in aligning to 1000 (and the retry beam width to 4000). This command is equivalent to supplying a config file like the one below via --config_path:
beam: 1000
Supplying the above via:
mfa align ... --config_path config_above.yaml
will also set the beam width to 1000 and the retry beam width to 4000.
For simple settings, the command line argument approach works well, but for more complex settings, a YAML config file lets you specify things like training blocks or punctuation:
beam: 100
retry_beam: 400
punctuation: ":,."

training:
  - monophone:
      num_iterations: 20
      max_gaussians: 500
      subset: 1000
      boost_silence: 1.25

  - triphone:
      num_iterations: 35
      num_leaves: 2000
      max_gaussians: 10000
      cluster_threshold: -1
      subset: 5000
      boost_silence: 1.25
      power: 0.25
You can also override these options on the command line; for example, --beam 10 --config_path config_above.yaml would reset the beam width to 10. Arguments specified on the command line always take priority over parameters derived from a configuration YAML file.
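The priority rule can be pictured as overlaying one dictionary on another. The following is a minimal sketch of that idea, not MFA's own code: merge_settings is a hypothetical name, the config values are copied by hand from the YAML example rather than parsed from a file, and using None to mean "flag not given" is an assumption made for the illustration.

```python
def merge_settings(config_values, cli_values):
    """Overlay command-line values on config-file values; the command line wins."""
    merged = dict(config_values)
    # None stands for "flag not given on the command line", so it is skipped.
    merged.update({key: value for key, value in cli_values.items() if value is not None})
    return merged


# Top-level values from the YAML example, written out as a dictionary.
config_above = {"beam": 100, "retry_beam": 400, "punctuation": ":,."}

# Equivalent of passing --beam 10 and leaving --retry_beam unset.
cli = {"beam": 10, "retry_beam": None}

settings = merge_settings(config_above, cli)
print(settings)  # {'beam': 10, 'retry_beam': 400, 'punctuation': ':,.'}
```

Only the setting given on the command line changes; everything else keeps the value from the configuration file.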