Pretrained models

Pretrained acoustic models

As part of using the Montreal Forced Aligner in our own research, we have trained acoustic models for a number of languages, which can be downloaded below. For each model, note the dictionary it was trained with for more information about its phone set. When using a model with a pronunciation dictionary, the phone sets of the two must be compatible. If the orthography of the language is transparent, we likely have a G2P model that can generate the necessary pronunciation dictionary.
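
The compatibility requirement above can be checked before alignment. The following is a minimal sketch, not part of MFA itself: it assumes a dictionary with one entry per line, a word followed by whitespace-separated phones, and it assumes you already have the acoustic model's phone inventory as a set. The function names, the phone inventory, and the dictionary entries are all hypothetical.

```python
from typing import Dict, Iterable, List, Set


def parse_dictionary(lines: Iterable[str]) -> Dict[str, List[List[str]]]:
    """Parse lines of the assumed form 'WORD PH1 PH2 ...' into word -> pronunciations."""
    lexicon: Dict[str, List[List[str]]] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and entries without any phones
        word, phones = fields[0], fields[1:]
        lexicon.setdefault(word, []).append(phones)
    return lexicon


def incompatible_phones(
    lexicon: Dict[str, List[List[str]]], model_phones: Set[str]
) -> Dict[str, Set[str]]:
    """Return, for each word, the phones that are not in the model's phone set."""
    problems: Dict[str, Set[str]] = {}
    for word, pronunciations in lexicon.items():
        used = {phone for pron in pronunciations for phone in pron}
        unknown = used - model_phones
        if unknown:
            problems[word] = unknown
    return problems


# Hypothetical data: a tiny Arpabet-style inventory and a three-word dictionary.
model_phones = {"HH", "AH0", "L", "OW1", "W", "ER1", "D"}
dictionary_lines = [
    "hello HH AH0 L OW1",
    "world W ER1 L D",
    "bonjour b o~ Z u R",  # phones from a different phone set
]

lexicon = parse_dictionary(dictionary_lines)
problems = incompatible_phones(lexicon, model_phones)
for word, unknown in sorted(problems.items()):
    print(word, sorted(unknown))
```

An empty result means that every phone in the dictionary is known to the model. Any word that is reported needs either a different dictionary or a model trained on a matching phone set.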

Language     Link                                Corpus       Phone set
Arabic       Not available yet                   GlobalPhone  GlobalPhone
Bulgarian    Bulgarian acoustic model            GlobalPhone  GlobalPhone
Croatian     Croatian acoustic model             GlobalPhone  GlobalPhone
Czech        Czech acoustic model                GlobalPhone  GlobalPhone
English      English acoustic model              LibriSpeech  Arpabet
French (FR)  French (FR) acoustic model          GlobalPhone  GlobalPhone
French (FR)  French (Prosodylab) acoustic model  GlobalPhone  Prosodylab [1]
French (QC)  French (QC) acoustic model          Lab speech   Prosodylab [1]
German       German acoustic model               GlobalPhone  GlobalPhone
German       German (Prosodylab) acoustic model  GlobalPhone  Prosodylab [3]
Hausa        Hausa acoustic model                GlobalPhone  GlobalPhone
Japanese     Not available yet                   GlobalPhone  GlobalPhone
Korean       Korean acoustic model               GlobalPhone  GlobalPhone
Mandarin     Mandarin acoustic model             GlobalPhone  GlobalPhone
Polish       Polish acoustic model               GlobalPhone  GlobalPhone
Portuguese   Portuguese acoustic model           GlobalPhone  GlobalPhone
Russian      Russian acoustic model              GlobalPhone  GlobalPhone
Swahili      Swahili acoustic model              GlobalPhone  GlobalPhone
Swedish      Swedish acoustic model              GlobalPhone  GlobalPhone
Tamil        Not available yet                   GlobalPhone  GlobalPhone
Thai         Thai acoustic model                 GlobalPhone  GlobalPhone
Turkish      Turkish acoustic model              GlobalPhone  GlobalPhone
Ukrainian    Ukrainian acoustic model            GlobalPhone  GlobalPhone
Vietnamese   Vietnamese acoustic model           GlobalPhone  GlobalPhone
Wu           Not available yet                   GlobalPhone  GlobalPhone

Pretrained G2P models

Included with MFA is a separate tool for generating a dictionary from a preexisting G2P model. Use it if you are aligning a dataset for which you have no pronunciation dictionary, or if the orthography of the language is very transparent. We have pretrained models for several languages, which can be downloaded below. These models were generated using Phonetisaurus and the GlobalPhone dataset, so they will only work for transcriptions that use the same alphabet. The table below lists the available languages, with the accuracies obtained when training on 90% of the data and testing on the remaining 10%:

Language     Link                               Accuracy (%)  Orthography system   Phone set
Arabic       Arabic G2P model                   95.4          Romanized [2]        GlobalPhone
Bulgarian    Bulgarian G2P model                97.3          Cyrillic alphabet    GlobalPhone
Croatian     Croatian G2P model                 92.7          Latin alphabet       GlobalPhone
Czech        Czech G2P model                    96.8          Latin alphabet       GlobalPhone
French       French G2P model                   93.2          Latin alphabet       GlobalPhone
French       French (Prosodylab) G2P model [1]  95.2          Latin alphabet       Prosodylab
German       German G2P model                   67.0          Latin alphabet       GlobalPhone
German       German (Prosodylab) G2P model [3]  94.1          Latin alphabet       Prosodylab
Hausa        Hausa G2P model                    70.1          Latin alphabet       GlobalPhone
Japanese     Japanese G2P model                 82.1          Romanized            GlobalPhone
Korean       Korean G2P model                   89.5          Hangul               GlobalPhone
Mandarin     Mandarin Pinyin G2P model          99.9          Pinyin               Pinyin phones
Mandarin     Mandarin Character G2P model [4]   83.2          Hanzi                Pinyin phones
Polish       Polish G2P model                   98.8          Latin alphabet       GlobalPhone
Portuguese   Portuguese G2P model               86.5          Latin alphabet       GlobalPhone
Russian      Russian G2P model                  96.4          Cyrillic alphabet    GlobalPhone
Spanish      Spanish G2P model                  94.0          Latin alphabet       GlobalPhone
Swahili      Swahili G2P model                  99.9          Latin alphabet       GlobalPhone
Swedish      Swedish G2P model                  83.3          Latin alphabet       GlobalPhone
Thai         Thai G2P model                     71.7          Thai script          GlobalPhone
Turkish      Turkish G2P model                  83.3          Latin alphabet       GlobalPhone
Ukrainian    Ukrainian G2P model                98.0          Cyrillic alphabet    GlobalPhone
Vietnamese   Vietnamese G2P model               98.2          Vietnamese alphabet  GlobalPhone
Wu           Wu G2P model [5]                   77.5          Hanzi                GlobalPhone
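
The accuracies in the table come from training on 90% of the data and testing on the remaining 10%. The sketch below illustrates that kind of held-out evaluation under several assumptions: the source does not say whether accuracy is counted per word or per phone, so exact-match word accuracy is assumed here; the one-letter-to-one-phone model is a toy stand-in for Phonetisaurus; and the lexicon and all function names are hypothetical.

```python
import random
from typing import Callable, Dict, List, Tuple

Lexicon = Dict[str, List[str]]  # word -> phone sequence


def split_lexicon(
    lexicon: Lexicon, test_fraction: float = 0.1, seed: int = 0
) -> Tuple[Lexicon, Lexicon]:
    """Shuffle the words reproducibly and hold out a fraction of them for testing."""
    words = sorted(lexicon)
    random.Random(seed).shuffle(words)
    n_test = max(1, round(len(words) * test_fraction))
    test_words = set(words[:n_test])
    train = {w: p for w, p in lexicon.items() if w not in test_words}
    test = {w: p for w, p in lexicon.items() if w in test_words}
    return train, test


def train_letter_model(train: Lexicon) -> Callable[[str], List[str]]:
    """Toy stand-in for a G2P trainer: learn a one-letter-to-one-phone mapping."""
    mapping: Dict[str, str] = {}
    for word, phones in train.items():
        if len(word) == len(phones):
            mapping.update(zip(word, phones))
    return lambda word: [mapping.get(ch, "<unk>") for ch in word]


def word_accuracy(predict: Callable[[str], List[str]], test: Lexicon) -> float:
    """Percentage of held-out words whose predicted pronunciation matches exactly."""
    correct = sum(1 for word, phones in test.items() if predict(word) == phones)
    return 100.0 * correct / len(test)


# Hypothetical, perfectly transparent lexicon of 21 two-letter words.
lexicon = {c + v: [c.upper(), v.upper()] for c in "bdkmnst" for v in "aiu"}

train, test = split_lexicon(lexicon)
predict = train_letter_model(train)
accuracy = word_accuracy(predict, test)
print(f"{len(train)} training words, {len(test)} test words, accuracy {accuracy:.1f}%")
```

Because the test words are never seen during training, a score computed this way estimates how well a model generalizes to new words with the same orthography. The lower figures in the table are consistent with less transparent orthographies or smaller lexicons.
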

[1] The ProsodyLab French dictionary is based on Lexique, with substitutions for numbers and special characters. Note that Lexique is currently known not to work with the aligner; see the GitHub issue for more information and status.
[2] Please see the GlobalPhone documentation for how the romanization was done for Arabic.
[3] The German dictionary used in training is available in the ProsodyLab dictionary repository. See for more information on the CELEX phone set for German and how it maps to other phone sets.
[4] The Mandarin character dictionary that served as the training data for this model was built by mapping between characters in .trl files and pinyin syllables in .rmn files in the GlobalPhone corpus.
[5] The Wu G2P model was trained on a fairly small lexicon, so it likely does not have the coverage to be a robust model for most purposes. Please check any resulting dictionaries carefully, as they are likely to have missing syllables from unknown symbols.
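
One way to act on the advice in note [5] is to flag, before generating a dictionary, every word that contains a symbol the model's training lexicon never covered. The sketch below is an illustration only: it assumes you have access to the list of words the G2P model was trained on, and the function names and the example words are hypothetical placeholders rather than data from the Wu lexicon.

```python
from typing import Dict, Iterable, List, Set


def known_graphemes(training_words: Iterable[str]) -> Set[str]:
    """Collect every symbol that occurs in the training lexicon."""
    return {ch for word in training_words for ch in word}


def words_with_unknown_symbols(
    words: Iterable[str], graphemes: Set[str]
) -> Dict[str, List[str]]:
    """Map each word to the symbols in it that the training lexicon never covered."""
    flagged: Dict[str, List[str]] = {}
    for word in words:
        missing = sorted(set(word) - graphemes)
        if missing:
            flagged[word] = missing
    return flagged


# Hypothetical placeholder data standing in for a small Hanzi lexicon and a corpus.
training_words = ["上海", "海上", "人"]
corpus_words = ["上海人", "海鲜", "大人"]

graphemes = known_graphemes(training_words)
flagged = words_with_unknown_symbols(corpus_words, graphemes)
for word, missing in flagged.items():
    print(word, "contains unseen symbols:", " ".join(missing))
```

Words flagged this way are the ones most likely to come out with missing syllables, so their generated pronunciations are worth reviewing by hand.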