Pretrained models
Pretrained acoustic models
As part of using the Montreal Forced Aligner in our own research, we have trained acoustic models for a number of languages, which can be downloaded below. For each model, note the dictionary it was trained with, since that dictionary determines the model's phone set. When you use a model with a pronunciation dictionary, the dictionary's phone set must be compatible with the model's. If the orthography of the language is transparent, we likely have a G2P model that can generate the necessary pronunciation dictionary.
| Language | Link | Corpus | Phone set |
|---|---|---|---|
| Arabic | Not available yet | GlobalPhone | GlobalPhone |
| Bulgarian | | GlobalPhone | GlobalPhone |
| Croatian | | GlobalPhone | GlobalPhone |
| Czech | | GlobalPhone | GlobalPhone |
| English | | LibriSpeech | Arpabet |
| French (FR) | | GlobalPhone | GlobalPhone |
| French (FR) | | GlobalPhone | Prosodylab [1] |
| French (QC) | | Lab speech | Prosodylab [1] |
| German | | GlobalPhone | GlobalPhone |
| German | | GlobalPhone | Prosodylab [3] |
| Hausa | | GlobalPhone | GlobalPhone |
| Japanese | Not available yet | GlobalPhone | GlobalPhone |
| Korean | | GlobalPhone | GlobalPhone |
| Mandarin | | GlobalPhone | GlobalPhone |
| Polish | | GlobalPhone | GlobalPhone |
| Portuguese | | GlobalPhone | GlobalPhone |
| Russian | | GlobalPhone | GlobalPhone |
| Swahili | | GlobalPhone | GlobalPhone |
| Swedish | | GlobalPhone | GlobalPhone |
| Tamil | Not available yet | GlobalPhone | GlobalPhone |
| Thai | | GlobalPhone | GlobalPhone |
| Turkish | | GlobalPhone | GlobalPhone |
| Ukrainian | | GlobalPhone | GlobalPhone |
| Vietnamese | | GlobalPhone | GlobalPhone |
| Wu | Not available yet | GlobalPhone | GlobalPhone |
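Because a dictionary's phone set must be compatible with the acoustic model's, it can help to check a dictionary before aligning. The following is a minimal sketch of such a check, not part of MFA itself. It assumes a plain-text dictionary with one `WORD PHONE1 PHONE2 ...` entry per line; the function names (`parse_dictionary`, `unsupported_phones`) and the toy phone set are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, Set


def parse_dictionary(lines: Iterable[str]) -> Dict[str, List[List[str]]]:
    """Parse 'WORD PHONE1 PHONE2 ...' lines into word -> list of pronunciations.

    Assumption: fields are whitespace-separated and the first field is the word.
    """
    entries: Dict[str, List[List[str]]] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and words with no pronunciation
        word, phones = fields[0], fields[1:]
        entries.setdefault(word, []).append(phones)
    return entries


def unsupported_phones(
    entries: Dict[str, List[List[str]]], model_phones: Set[str]
) -> Dict[str, Set[str]]:
    """Return, per word, the phones that the model's phone set does not contain."""
    problems: Dict[str, Set[str]] = {}
    for word, pronunciations in entries.items():
        missing = {
            phone
            for pronunciation in pronunciations
            for phone in pronunciation
            if phone not in model_phones
        }
        if missing:
            problems[word] = missing
    return problems


# Hypothetical toy data: a tiny Arpabet-like phone set and three entries,
# the last of which uses phones the model does not know.
model_phones = {"K", "AE1", "T", "D", "AO1", "G"}
dictionary_lines = ["CAT K AE1 T", "DOG D AO1 G", "CHAT SH A"]

entries = parse_dictionary(dictionary_lines)
problems = unsupported_phones(entries, model_phones)
for word, missing in problems.items():
    print(f"{word}: phones not in the model's phone set: {sorted(missing)}")
```

In practice the model's phone set would come from the dictionary the model was trained with; an empty result means every phone in the dictionary is covered.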
Pretrained G2P models
Included with MFA is a separate tool to generate a dictionary from a preexisting model. Use it if you’re aligning a dataset for which you have no pronunciation dictionary, or if the orthography is very transparent. We have pretrained models for several languages, which can be downloaded below. These models were generated using Phonetisaurus and the GlobalPhone dataset, so they will only work for transcriptions that use the same orthography system as the training data. The table below lists the available languages, with the following accuracies when trained on 90% of the data and tested on 10%:
| Language | Link | Accuracy (%) | Orthography system | Phone set |
|---|---|---|---|---|
| Arabic | | 95.4 | Romanized [1] | GlobalPhone |
| Bulgarian | | 97.3 | Cyrillic alphabet | GlobalPhone |
| Croatian | | 92.7 | Latin alphabet | GlobalPhone |
| Czech | | 96.8 | Latin alphabet | GlobalPhone |
| French | | 93.2 | Latin alphabet | GlobalPhone |
| French | | 95.2 | Latin alphabet | Prosodylab |
| German | | 67.0 | Latin alphabet | GlobalPhone |
| German | | 94.1 | Latin alphabet | Prosodylab |
| Hausa | | 70.1 | Latin alphabet | GlobalPhone |
| Japanese | | 82.1 | Romanized | GlobalPhone |
| Korean | | 89.5 | Hangul | GlobalPhone |
| Mandarin | | 99.9 | Pinyin | Pinyin phones |
| Mandarin | | 83.2 | Hanzi | Pinyin phones |
| Polish | | 98.8 | Latin alphabet | GlobalPhone |
| Portuguese | | 86.5 | Latin alphabet | GlobalPhone |
| Russian | | 96.4 | Cyrillic alphabet | GlobalPhone |
| Spanish | | 94.0 | Latin alphabet | GlobalPhone |
| Swahili | | 99.9 | Latin alphabet | GlobalPhone |
| Swedish | | 83.3 | Latin alphabet | GlobalPhone |
| Thai | | 71.7 | Thai script | GlobalPhone |
| Turkish | | 83.3 | Latin alphabet | GlobalPhone |
| Ukrainian | | 98.0 | Cyrillic alphabet | GlobalPhone |
| Vietnamese | | 98.2 | Vietnamese alphabet | GlobalPhone |
| Wu | | 77.5 | Hanzi | GlobalPhone |
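To make the evaluation setup concrete, here is a rough sketch of a 90%/10% split followed by an accuracy computation. It is an illustration under stated assumptions, not the procedure used to produce the numbers above: it assumes accuracy means the percentage of held-out words whose predicted pronunciation exactly matches the reference, and it replaces the Phonetisaurus model with a hypothetical stand-in function (`letter_to_phone`). The names `split_entries` and `word_accuracy` are likewise invented for this example.

```python
import random
from typing import Callable, List, Sequence, Tuple

Entry = Tuple[str, Tuple[str, ...]]  # (word, reference pronunciation)


def split_entries(
    entries: Sequence[Entry], train_fraction: float = 0.9, seed: int = 0
) -> Tuple[List[Entry], List[Entry]]:
    """Shuffle the entries reproducibly and split them into train and test."""
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def word_accuracy(
    predict: Callable[[str], Sequence[str]], test: Sequence[Entry]
) -> float:
    """Percentage of test words whose predicted phones match the reference exactly.

    Assumption: exact-match, word-level scoring; the source does not specify
    the metric behind the reported accuracies.
    """
    if not test:
        return 0.0
    correct = sum(1 for word, gold in test if tuple(predict(word)) == gold)
    return 100.0 * correct / len(test)


def letter_to_phone(word: str) -> List[str]:
    """Hypothetical stand-in for a trained G2P model: one phone per letter."""
    return [letter.upper() for letter in word]


# Toy data with a fully transparent orthography (one letter, one phone).
words = ["ba", "da", "ka", "ma", "na", "pa", "sa", "ta", "la", "ra"]
entries = [(word, tuple(word.upper())) for word in words]

train, test = split_entries(entries)  # 9 training entries, 1 test entry
accuracy = word_accuracy(letter_to_phone, test)
print(f"{len(train)} train / {len(test)} test, accuracy: {accuracy:.1f}%")
```

A real evaluation would train the G2P model on `train` only and call it in place of `letter_to_phone`; the toy predictor here ignores the training data entirely.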