Pretrained models

Several pretrained models are available for aligning speech and for generating pronunciation dictionaries. To download one, run mfa download <model_type>, where <model_type> is one of acoustic, g2p, or dictionary.
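If the downloads need to be scripted, the command above can be assembled and run from Python. The following is a minimal sketch, not part of MFA itself: the helper names build_download_command and run_download are hypothetical, and "english" is only an illustrative language identifier.

```python
import shutil
import subprocess

# Model types named in the text above.
MODEL_TYPES = ("acoustic", "g2p", "dictionary")


def build_download_command(model_type, language_id=None):
    """Return the argv list for `mfa download <model_type> [<language_id>]`."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"model_type must be one of {MODEL_TYPES}, got {model_type!r}")
    command = ["mfa", "download", model_type]
    if language_id is not None:
        command.append(language_id)
    return command


def run_download(model_type, language_id=None):
    """Run the download command, provided an `mfa` executable is on PATH."""
    if shutil.which("mfa") is None:
        raise RuntimeError("the mfa executable was not found on PATH")
    return subprocess.run(build_download_command(model_type, language_id), check=True)


if __name__ == "__main__":
    # "english" is an illustrative identifier, not a confirmed one.
    print(" ".join(build_download_command("acoustic", "english")))
```

Keeping the command construction separate from the subprocess call makes it possible to inspect or log the exact command before anything is downloaded.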

Pretrained acoustic models

As part of using the Montreal Forced Aligner in our own research, we have trained acoustic models for a number of languages. If you would like to use them, please download them below. For each model, note the dictionary and phone set it was trained with: any pronunciation dictionary you use with the model must have a compatible phone set. If the orthography of the language is transparent, we likely have a G2P model that can generate the necessary pronunciation dictionary.
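One way to check phone set compatibility before aligning is to compare the phones used in a dictionary against the phones an acoustic model expects. The sketch below assumes a plain dictionary layout with one entry per line (a word followed by whitespace-separated phones) and assumes you already have the model's phone set as a Python collection. Both function names are hypothetical, and the sample phones are illustrative only.

```python
def dictionary_phones(lines):
    """Collect every phone used in lines of the form 'word phone1 phone2 ...'."""
    phones = set()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and entries without a pronunciation
        phones.update(fields[1:])
    return phones


def unsupported_phones(lines, model_phones):
    """Return the dictionary phones that are missing from the model's phone set."""
    return dictionary_phones(lines) - set(model_phones)


if __name__ == "__main__":
    # Illustrative data, not taken from any released model or dictionary.
    model_phones = {"HH", "AH0", "L", "OW1", "W", "ER1", "D"}
    entries = ["hello HH AH0 L OW1", "world W ER1 L D", "zoo Z UW1"]
    print(sorted(unsupported_phones(entries, model_phones)))
```

An empty result suggests the dictionary uses only phones the model knows. Any phone that is returned would need to be remapped or removed before alignment.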

Any of the following acoustic models can be downloaded with the command mfa download acoustic <language_id>. You can get a full list of the currently available acoustic models via mfa download acoustic. New models contributed by users will be periodically added. If you would like to contribute your trained models, please contact Michael McAuliffe at michael.e.mcauliffe@gmail.com.

| Language | Link | Corpus | Number of speakers | Audio (hours) | Phone set |
| --- | --- | --- | --- | --- | --- |
| Arabic | Arabic acoustic model | GlobalPhone | 80 | 19.0 | GlobalPhone |
| Bulgarian | Bulgarian acoustic model | GlobalPhone | 79 | 21.4 | GlobalPhone |
| Croatian | Croatian acoustic model | GlobalPhone | 94 | 15.9 | GlobalPhone |
| Czech | Czech acoustic model | GlobalPhone | 102 | 31.7 | GlobalPhone |
| English | English acoustic model | LibriSpeech | 2484 | 982.3 | Arpabet (stressed) |
| French (FR) | French (FR) acoustic model | GlobalPhone | 100 | 26.9 | GlobalPhone |
| French (FR) | French (Prosodylab) acoustic model | GlobalPhone | 100 | 26.9 | Prosodylab [1] |
| French (QC) | French (QC) acoustic model | Lab speech | N/A | N/A | Prosodylab [1] |
| German | German acoustic model | GlobalPhone | 77 | 18 | GlobalPhone |
| German | German (Prosodylab) acoustic model | GlobalPhone | 77 | 18 | Prosodylab [3] |
| Hausa | Hausa acoustic model | GlobalPhone | 103 | 8.7 | GlobalPhone |
| Japanese | Not available yet | GlobalPhone | 144 | 34 | GlobalPhone |
| Korean | Korean acoustic model | GlobalPhone | 101 | 20.8 | GlobalPhone |
| Mandarin | Mandarin acoustic model | GlobalPhone | 132 | 31.2 | Pinyin phones [6] |
| Polish | Polish acoustic model | GlobalPhone | 99 | 24.6 | GlobalPhone |
| Portuguese | Portuguese acoustic model | GlobalPhone | 101 | 26.3 | GlobalPhone |
| Russian | Russian acoustic model | GlobalPhone | 115 | 26.5 | GlobalPhone |
| Spanish | Spanish acoustic model | GlobalPhone | 102 | 22.1 | GlobalPhone |
| Swahili | Swahili acoustic model | GlobalPhone | 70 | 11.1 | GlobalPhone |
| Swedish | Swedish acoustic model | GlobalPhone | 98 | 21.7 | GlobalPhone |
| Tamil | Not available yet | GlobalPhone | N/A | N/A | GlobalPhone |
| Thai | Thai acoustic model | GlobalPhone | 98 | 28.2 | GlobalPhone |
| Turkish | Turkish acoustic model | GlobalPhone | 100 | 17.1 | GlobalPhone |
| Ukrainian | Ukrainian acoustic model | GlobalPhone | 119 | 14.1 | GlobalPhone |
| Vietnamese | Vietnamese acoustic model | GlobalPhone | 129 | 19.7 | GlobalPhone |
| Wu | Not available yet | GlobalPhone | 41 | 9.3 | GlobalPhone |

Pretrained G2P models

MFA includes a separate tool that generates a pronunciation dictionary from a pretrained G2P model. Use it when you are aligning a dataset for which you have no pronunciation dictionary, or when the orthography of the language is very transparent. Pretrained models for several languages are listed below.

Any of the following G2P models can be downloaded with the command mfa download g2p <language_id>. You can get a full list of the currently available G2P models via mfa download g2p. New models contributed by users will be periodically added. If you would like to contribute your trained models, please contact Michael McAuliffe at michael.e.mcauliffe@gmail.com.

These models were trained using the Pynini package on the GlobalPhone dataset, with an implementation based on the Sigmorphon 2020 G2P task baseline. Because each model is trained on a specific orthography, it will only work for transcriptions that use the same alphabet. The current language options are listed below, with the error rates (WER and LER) obtained when training on 90% of the data and testing on the remaining 10%:

| Language | Link | WER (%) | LER (%) | Orthography system | Phone set |
| --- | --- | --- | --- | --- | --- |
| Arabic | Arabic G2P model | 28.45 | 7.42 | Romanized [2] | GlobalPhone |
| Bulgarian | Bulgarian G2P model | 3.08 | 0.38 | Cyrillic alphabet | GlobalPhone |
| Croatian | Croatian G2P model | 9.47 | 3.4 | Latin alphabet | GlobalPhone |
| Czech | Czech G2P model | 3.43 | 0.71 | Latin alphabet | GlobalPhone |
| English | English G2P model | 28.45 | 7.42 | Latin alphabet | Arpabet |
| French | French G2P model | 42.54 | 6.98 | Latin alphabet | GlobalPhone |
| French | French (Lexique) G2P model | 5.31 | 1.06 | Latin alphabet | Lexique |
| French | French (Prosodylab) G2P model [1] | 5.11 | 0.95 | Latin alphabet | Prosodylab |
| German | German G2P model | 36.16 | 7.84 | Latin alphabet | GlobalPhone |
| German | German (Prosodylab) G2P model [3] | 5.43 | 0.65 | Latin alphabet | Prosodylab |
| Hausa | Hausa G2P model | 32.54 | 7.19 | Latin alphabet | GlobalPhone |
| Japanese | Japanese G2P model | 17.45 | 7.17 | Kanji and kana | GlobalPhone |
| Korean | Korean Hangul G2P model | 11.85 | 1.38 | Hangul | GlobalPhone |
| Korean | Korean Jamo G2P model | 8.94 | 0.95 | Jamo | GlobalPhone |
| Mandarin | Mandarin Pinyin G2P model | 0.27 | 0.06 | Pinyin | Pinyin phones |
| Mandarin | Mandarin Character G2P model [4] | 23.81 | 11.2 | Hanzi | Pinyin phones [6] |
| Polish | Polish G2P model | 1.23 | 0.33 | Latin alphabet | GlobalPhone |
| Portuguese | Portuguese G2P model | 10.67 | 1.62 | Latin alphabet | GlobalPhone |
| Russian | Russian G2P model | 4.04 | 0.65 | Cyrillic alphabet | GlobalPhone |
| Spanish | Spanish G2P model | 17.93 | 3.02 | Latin alphabet | GlobalPhone |
| Swahili | Swahili G2P model | 0.09 | 0.02 | Latin alphabet | GlobalPhone |
| Swedish | Swedish G2P model | 18.75 | 3.14 | Latin alphabet | GlobalPhone |
| Thai | Thai G2P model | 27.62 | 7.48 | Thai script | GlobalPhone |
| Turkish | Turkish G2P model | 8.51 | 2.32 | Latin alphabet | GlobalPhone |
| Ukrainian | Ukrainian G2P model | 2.1 | 0.42 | Cyrillic alphabet | GlobalPhone |
| Vietnamese | Vietnamese G2P model | 14.91 | 3.46 | Vietnamese alphabet | GlobalPhone |
| Wu | Wu G2P model [5] | 31.19 | 13.04 | Hanzi | GlobalPhone |
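
To make the two error columns concrete, the sketch below shows one common way such G2P metrics are computed. It assumes that WER is the percentage of words whose predicted pronunciation differs from the reference, and that LER is a per-symbol rate: the total edit distance divided by the total number of reference phones. Whether the figures in the table were computed exactly this way is an assumption, and the function names are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences of phones."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_symbol in enumerate(reference, start=1):
        current = [i]
        for j, hyp_symbol in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_symbol != hyp_symbol)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def error_rates(references, hypotheses):
    """Return (WER, LER) as percentages over paired pronunciations."""
    if not references or len(references) != len(hypotheses):
        raise ValueError("need the same non-zero number of references and hypotheses")
    wrong_words = 0
    total_edits = 0
    total_symbols = 0
    for reference, hypothesis in zip(references, hypotheses):
        edits = edit_distance(reference, hypothesis)
        wrong_words += edits > 0  # a word counts as wrong if any phone differs
        total_edits += edits
        total_symbols += len(reference)
    if total_symbols == 0:
        raise ValueError("references contain no phones")
    return 100.0 * wrong_words / len(references), 100.0 * total_edits / total_symbols


if __name__ == "__main__":
    gold = [["f", "a2", "ng"], ["x", "iao4"]]
    predicted = [["f", "a2", "ng"], ["x", "iao3"]]
    print(error_rates(gold, predicted))
```

Under these definitions a single wrong phone makes the whole word count as an error, which is why WER is always at least as harsh as LER on the same predictions.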

[1] The Prosodylab French dictionary is based on Lexique with substitutions for numbers and special characters. Note that Lexique is currently known not to work with the aligner; see the GitHub issue for more information and status.
[2] Please see the GlobalPhone documentation for how the romanization was done for Arabic.
[3] The German dictionary used in training is available in the Prosodylab dictionary repository. See http://www.let.uu.nl/~Hugo.Quene/personal/phonchar.html for more information on the CELEX phone set for German and how it maps to other phone sets.
[4] The Mandarin character dictionary that served as the training data for this model was built by mapping between characters in .trl files and pinyin syllables in .rmn files in the GlobalPhone corpus.
[5] The Wu G2P model was trained on a fairly small lexicon, so it likely does not have the coverage to be a robust model for most purposes. Please check any resulting dictionaries carefully, as they are likely to have missing syllables from unknown symbols.
[6] The phone set for Mandarin was created by GlobalPhone by splitting each Pinyin syllable into onset, nucleus (any vowel sequence), and coda, and then attaching the tone of the syllable to the nucleus (e.g. “fang2” -> “f a2 ng” and “xiao4” -> “x iao4”).
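
The Pinyin split described in note [6] can be illustrated with a short sketch. This is not GlobalPhone's actual procedure. It assumes tone-numbered syllables with tones 1 to 5, treats y and w as onsets, and accepts only n, ng, and r as codas; the function name split_pinyin is hypothetical.

```python
import re

# Onset (optional), nucleus (any vowel sequence), coda (optional), tone digit.
SYLLABLE = re.compile(
    r"^(?P<onset>zh|ch|sh|[bpmfdtnlgkhjqxrzcsyw])?"
    r"(?P<nucleus>[aeiouvü]+)"
    r"(?P<coda>ng|n|r)?"
    r"(?P<tone>[1-5])$"
)


def split_pinyin(syllable):
    """Split a tone-numbered Pinyin syllable into phones, with the tone on the nucleus."""
    match = SYLLABLE.match(syllable)
    if match is None:
        raise ValueError(f"cannot parse Pinyin syllable: {syllable!r}")
    phones = []
    if match["onset"]:
        phones.append(match["onset"])
    phones.append(match["nucleus"] + match["tone"])
    if match["coda"]:
        phones.append(match["coda"])
    return " ".join(phones)


if __name__ == "__main__":
    for syllable in ("fang2", "xiao4"):
        print(syllable, "->", split_pinyin(syllable))
```

On the two examples from the note, this reproduces “f a2 ng” and “x iao4”. Syllables outside the assumed pattern raise an error instead of being split silently.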

Available pronunciation dictionaries

Any of the following pronunciation dictionaries can be downloaded with the command mfa download dictionary <language_id>. You can get a full list of the currently available dictionaries via mfa download dictionary. New dictionaries contributed by users will be periodically added. If you would like to contribute your dictionaries, please contact Michael McAuliffe at michael.e.mcauliffe@gmail.com.

| Language | Link | Orthography system | Phone set |
| --- | --- | --- | --- |
| English | English pronunciation dictionary | Latin | Arpabet (stressed) |
| French | French Prosodylab dictionary | Latin | Prosodylab French |
| German | German Prosodylab dictionary | Latin | Prosodylab German |
| Brazilian Portuguese | FalaBrasil dictionary | Latin | |

Available language models

Pretrained language models that can be imported into MFA are available from several sources.

| Source | Language | Link |
| --- | --- | --- |
| GlobalPhone | Various languages | GlobalPhone language models |
| LibriSpeech | English | LibriSpeech language models |
| FalaBrasil | Brazilian Portuguese | FalaBrasil language models |