Changelog

1.1.0

Major changes to the system; see “What’s new in 1.1”.

1.0.0

  • Added Grapheme-to-Phoneme capabilities
  • Acoustic models no longer contain the dictionary they were trained with
  • Dictionaries must be specified when aligning using pretrained models
  • The aligner now automatically cleans the temporary directory when the previous run failed
  • Added validation for types of command line arguments
  • Files that could not be read using UTF-8 are now caught and listed
  • Updated Kaldi to version 5.1 and OpenFST to version 1.6.2 on Mac and Linux
  • Added support for specifying custom non-speech annotations in the pronunciation dictionary with sil and spn
  • Made command line flags more consistent in spelling
  • Made pretrained models for many languages available
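
  The UTF-8 check listed above can be illustrated with a short sketch. This is not the aligner's actual code; the function name find_non_utf8_files is hypothetical, and the approach (decode each file fully and collect the failures) is only one plausible way to do it.

```python
from pathlib import Path


def find_non_utf8_files(paths):
    """Return the paths whose contents cannot be decoded as UTF-8.

    Hypothetical helper for illustration; not part of the aligner's API.
    """
    unreadable = []
    for path in paths:
        try:
            # Decoding the whole file forces every byte to be validated.
            Path(path).read_text(encoding="utf-8")
        except UnicodeDecodeError:
            unreadable.append(str(path))
    return unreadable
```

  Collecting all failures before reporting, rather than stopping at the first one, lets the user fix every offending transcription in a single pass.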

0.8.0

  • Fixed an issue where aligning using pretrained models was improperly updating the original model with sparser data
  • Added a flag to turn off speaker adaptation when aligning using a pretrained model
  • Optimized training graph generation when aligning using a pretrained model

0.7.3

  • Added warning messages and log output when wav files are ignored because their sampling rate is too low or because they have no associated .lab or .TextGrid file
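
  As a rough sketch of the two conditions described above, the following uses only the Python standard library. The function name find_ignored_wavs and the 16000 Hz threshold are assumptions made for illustration; the entry above does not state the actual cutoff or how the aligner implements the check.

```python
import wave
from pathlib import Path

MIN_SAMPLE_RATE = 16000  # assumed threshold, for illustration only


def find_ignored_wavs(corpus_dir, min_rate=MIN_SAMPLE_RATE):
    """Return (filename, reason) pairs for .wav files that would be skipped.

    Hypothetical helper; not the aligner's own implementation.
    """
    ignored = []
    for wav_path in sorted(Path(corpus_dir).glob("*.wav")):
        with wave.open(str(wav_path), "rb") as wav_file:
            rate = wav_file.getframerate()
        if rate < min_rate:
            reason = f"sampling rate {rate} Hz is below {min_rate} Hz"
            ignored.append((wav_path.name, reason))
            continue
        # A transcription must sit next to the wav file with the same stem.
        has_transcript = any(
            wav_path.with_suffix(ext).exists() for ext in (".lab", ".TextGrid")
        )
        if not has_transcript:
            ignored.append((wav_path.name, "no .lab or .TextGrid file found"))
    return ignored
```

  Each returned pair could then be written to a log or printed as a warning.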

0.7.2

  • Fixed an issue where speaker character flags were being ignored when parsing TextGrid files

0.7.1

  • Fixed an issue where the number of Gaussians was set too low for triphone training

0.7.0

  • Fixed an issue with Unicode characters not being correctly parsed when using --nodict
  • Fixed an issue where short intervals in TextGrid files were not being properly ignored
  • Added a command line option --temp_directory that lets users specify the temporary directory where MFA stores all files during alignment (default: ~/Documents/MFA)
  • Added a logging directory and some logging for when utterances are ignored
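
  To show how an option like --temp_directory with a home-relative default might be wired up, here is a minimal argparse sketch. Only the flag name and the default path come from the entry above; build_parser and resolve_temp_directory are hypothetical names, and the real command line parser may be organized differently.

```python
import argparse
import os


def build_parser():
    """Build a toy parser exposing only the --temp_directory option."""
    parser = argparse.ArgumentParser(description="Hypothetical aligner command line")
    parser.add_argument(
        "--temp_directory",
        default="~/Documents/MFA",
        help="directory for temporary files created during alignment",
    )
    return parser


def resolve_temp_directory(argv):
    """Parse argv and return the temporary directory as an expanded path."""
    args = build_parser().parse_args(argv)
    # Expand "~" so the default and user-supplied paths are handled alike.
    return os.path.expanduser(args.temp_directory)
```

  Calling resolve_temp_directory([]) yields the default location under the user's home directory, while passing ["--temp_directory", "/some/path"] overrides it.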

0.6.3

  • Improved memory and time efficiency of extracting channels from stereo files, particularly for long sound files
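
  One common way to keep memory use flat when pulling a single channel out of a long stereo file is to stream it in fixed-size chunks instead of loading it whole. The sketch below illustrates that general idea with the standard wave module; it is an assumption about the kind of technique involved, not the aligner's actual implementation, and extract_channel and chunk_frames are hypothetical names.

```python
import wave


def extract_channel(stereo_path, mono_path, channel=0, chunk_frames=65536):
    """Copy one channel of a multi-channel PCM wav file into a mono wav file.

    Hypothetical helper; reads chunk_frames frames at a time so that memory
    use does not grow with the length of the recording.
    """
    with wave.open(stereo_path, "rb") as src, wave.open(mono_path, "wb") as dst:
        n_channels = src.getnchannels()
        width = src.getsampwidth()
        if not 0 <= channel < n_channels:
            raise ValueError("channel index out of range")
        dst.setnchannels(1)
        dst.setsampwidth(width)
        dst.setframerate(src.getframerate())
        frame_size = n_channels * width  # bytes per interleaved frame
        offset = channel * width  # byte offset of the chosen channel
        while True:
            chunk = src.readframes(chunk_frames)
            if not chunk:
                break
            mono = bytearray(len(chunk) // n_channels)
            # Copy byte i of the chosen channel's sample from every frame.
            for i in range(width):
                mono[i::width] = chunk[offset + i::frame_size]
            dst.writeframes(bytes(mono))
```

  The strided slice assignment avoids a per-sample Python loop, so each chunk is processed with only a few operations regardless of its length.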

0.6.2

  • Fixed an issue where pretrained models were not being bundled with the source code

0.6.1

  • Fixed an issue with Linux binaries not finding Kaldi binaries
  • English models now use the entire LibriSpeech dataset rather than just the clean subset (the primary difference between the two being a larger number of accents)

0.6.0

  • Added command line argument --clean to remove temporary files
  • Added support for multiple sampling rates in a single dataset
  • Fixed some bugs related to using a single process
  • Fixed a bug where spaces were being inserted into transcriptions when using --nodict
  • Fixed a bug where having no out-of-vocabulary items would cause a crash at the end of aligning
  • Fixed a bug where the frozen executable could not find the included pretrained models
  • Fixed an issue where dictionaries in model outputs were binary files rather than editable text files
  • Added docstrings to main classes
  • Updated the built-in model english to use the full 1000-hour LibriSpeech corpus

0.5.0

  • Initial release
  • Prosodylab-aligner format supported
  • TextGrid format supported
  • Align using pretrained models supported
  • Train models and align concurrently supported