Database
MFA uses a SQLite database to cache information during training and alignment runs. Training on larger corpora previously ran into memory bottlenecks because all of the corpus information was held in memory, and fMLLR estimation in later stages would crash. There was also a constant trade-off between storing results (for use in other applications such as Anchor Annotator, or for providing diagnostic information to users) and keeping the core MFA workflows as memory- and time-efficient as possible. Offloading to a database frees up some memory and makes some computations more efficient, and it should be optimized enough not to slow down regular processing.
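As a rough illustration of the idea in the paragraph above, the following sketch uses Python's standard `sqlite3` module to write a toy corpus to an on-disk database, read it back in fixed-size batches, and push an aggregation into SQL. It is not MFA's actual implementation: the `utterance` table, its columns, and the helper names `build_cache`, `iter_batches`, and `count_per_speaker` are all hypothetical and chosen only for this example.

```python
import os
import sqlite3
import tempfile


def build_cache(db_path, utterances):
    """Write (speaker, text) pairs to an on-disk SQLite table (hypothetical schema)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE utterance ("
        "id INTEGER PRIMARY KEY, speaker TEXT NOT NULL, text TEXT NOT NULL)"
    )
    # executemany accepts any iterable, so the corpus never has to be a list.
    conn.executemany(
        "INSERT INTO utterance (speaker, text) VALUES (?, ?)", utterances
    )
    conn.commit()
    return conn


def iter_batches(conn, batch_size):
    """Yield lists of rows so that only one batch is held in memory at a time."""
    cursor = conn.execute("SELECT id, speaker, text FROM utterance ORDER BY id")
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


def count_per_speaker(conn):
    """Aggregate inside SQLite instead of looping over Python objects."""
    query = "SELECT speaker, COUNT(*) FROM utterance GROUP BY speaker"
    return dict(conn.execute(query).fetchall())


# Toy corpus produced by a generator; a real corpus would come from disk.
toy_corpus = ((f"speaker_{i % 3}", f"utterance text {i}") for i in range(10))
db_path = os.path.join(tempfile.mkdtemp(), "cache.db")
conn = build_cache(db_path, toy_corpus)

batch_sizes = [len(batch) for batch in iter_batches(conn, batch_size=4)]
speaker_counts = count_per_speaker(conn)
conn.close()
```

The design point is that the full corpus lives on disk, while Python only ever holds one batch of rows or a small aggregated result.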
|
Database class for storing information about a pronunciation dictionary |
|
Database class for storing information about a dialect |
|
Database class for storing words, their integer IDs, and pronunciation information |
|
Database class for storing information about a pronunciation |
|
Database class for storing phones and their integer IDs |
|
Database class for storing phones and their integer IDs |
|
Database class for storing information about files in the corpus |
|
Database class for storing information about transcription files |
|
Database class for storing information about sound files |
|
Database class for storing information about speakers |
|
Database class for storing information about utterances |
|
Database class for storing information about aligned word intervals |
|
Database class for storing information about aligned phone intervals |
|
Database class for storing information about a particular workflow (alignment, transcription, etc) |
|
Database class for storing information about a phonological rule |
|
Database class for mapping rules to generated pronunciations |
|
Database class for storing information about multiprocessing jobs |
|
Database class for storing many-to-many G2P training information |
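Several of the classes listed above pair an item with an integer ID and link it to related records, for example words and their pronunciations. The sketch below shows one way such a relationship could be laid out in SQLite. It is an assumption-laden illustration, not MFA's real schema: the `word` and `pronunciation` tables, their columns, and the helpers `add_word` and `pronunciations_for` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the word_id reference
conn.executescript(
    """
    CREATE TABLE word (
        id INTEGER PRIMARY KEY,          -- integer ID assigned by SQLite
        word TEXT NOT NULL UNIQUE
    );
    CREATE TABLE pronunciation (
        id INTEGER PRIMARY KEY,
        word_id INTEGER NOT NULL REFERENCES word(id),
        phones TEXT NOT NULL             -- space-separated phone string
    );
    """
)


def add_word(conn, word, pronunciations):
    """Insert a word and its pronunciations; return the word's integer ID."""
    cursor = conn.execute("INSERT INTO word (word) VALUES (?)", (word,))
    word_id = cursor.lastrowid
    conn.executemany(
        "INSERT INTO pronunciation (word_id, phones) VALUES (?, ?)",
        [(word_id, phones) for phones in pronunciations],
    )
    return word_id


def pronunciations_for(conn, word):
    """Look up all pronunciations of a word through a join on its ID."""
    rows = conn.execute(
        "SELECT p.phones FROM pronunciation AS p "
        "JOIN word AS w ON w.id = p.word_id "
        "WHERE w.word = ? ORDER BY p.id",
        (word,),
    )
    return [row[0] for row in rows]


read_id = add_word(conn, "read", ["r iy d", "r eh d"])
the_id = add_word(conn, "the", ["dh ah"])
read_prons = pronunciations_for(conn, "read")
```

Storing the integer ID once and referencing it from related tables keeps each row small and lets lookups run as indexed joins rather than Python-side dictionary scans.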