2.0 Changelog

2.0.6

  • Added G2P and language model training support on Windows

  • Fixed a bug where exporting trained models to the current working directory failed GitHub #494

  • Fixed a crash in exporting transcriptions to TextGrids

  • Added support for parsing out longer quoted strings GitHub #492

  • Fixed the error message for files with no file extension GitHub #495

  • Fixed a PhoneSetType error for some models trained on earlier versions GitHub #496 and GitHub #484

2.0.5

  • Standardized the pronunciation dictionary format to require tab delimitation between orthography, pronunciations, and any probabilities in the dictionary GitHub #478

  • Fixed a bug in pronunciation probability estimation when silence words are explicitly transcribed GitHub #476

  • Fixed an optimization bug introduced when fixing sparse job/subset combinations
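
The tab-delimited dictionary format from this release can be illustrated with a minimal parsing sketch. The function name `parse_dictionary_line` is hypothetical and is not part of the aligner's API. The sketch assumes only that the first tab-separated field is the orthography, that any field parsing as a number is a probability, and that the one remaining field is a space-separated pronunciation; the actual column order and the number of probability columns are not specified here.

```python
def parse_dictionary_line(line):
    """Split one tab-delimited dictionary line into (word, phones, probabilities).

    Hypothetical helper for illustration only; column semantics are assumed.
    """
    fields = [field.strip() for field in line.rstrip("\n").split("\t")]
    if len(fields) < 2 or not fields[0]:
        raise ValueError(f"expected tab-delimited fields, got: {line!r}")

    word = fields[0]  # may contain spaces, since tabs are the only delimiter
    probabilities = []
    phones = None
    for field in fields[1:]:
        try:
            # Assumption: any numeric field is a probability.
            probabilities.append(float(field))
        except ValueError:
            if phones is not None:
                raise ValueError(f"more than one pronunciation field: {line!r}")
            phones = field.split()

    if not phones:
        raise ValueError(f"no pronunciation found: {line!r}")
    return word, phones, probabilities
```

Because the split is on tabs only, a line without probabilities and a line with them are handled by the same code path, and an orthography containing a space stays intact. One limitation of this sketch is that a pronunciation consisting of a single numeric-looking phone would be misread as a probability.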

2.0.4

  • Fixed the Phonetisaurus training error in 2.0.2

2.0.2

  • Optimized Phonetisaurus training regime for phone and grapheme orders greater than 1

  • Fixed a bug in parsing dictionaries that included whitespace as part of the word

  • Fixed a bug in Phonetisaurus generation where insertions and deletions were not being properly generated

  • Changed the default alignment separator for Phonetisaurus to ; instead of } (which should not conflict with most phone sets) and added extra validation to ensure special symbols are not present in the dictionary

  • Fixed a bug where a trained Phonetisaurus model was not properly using its grapheme order

  • Fixed a bug when saving a Phonetisaurus model after evaluating it
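
The dictionary validation mentioned in this release could look roughly like the sketch below. It is an illustration under stated assumptions, not the aligner's implementation: `find_symbol_conflicts` is a hypothetical name, and the default reserved set contains only the `;` separator named above. Any other reserved symbols would need to be passed in by the caller.

```python
def find_symbol_conflicts(entries, special_symbols=(";",)):
    """Return sorted (word, symbol) pairs where a reserved symbol appears.

    `entries` is an iterable of (word, phones) pairs. Hypothetical helper;
    the set of reserved symbols is an assumption supplied by the caller.
    """
    conflicts = set()
    for word, phones in entries:
        # Check both the orthography and every phone for reserved symbols.
        tokens = [word, *phones]
        for symbol in special_symbols:
            if any(symbol in token for token in tokens):
                conflicts.add((word, symbol))
    return sorted(conflicts)
```

Running a check like this before training lets conflicting entries be reported together, rather than surfacing later as malformed alignments.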

2.0.1

  • Fixed a typo in the save model message GitHub #470

  • Fixed an issue with offset alignments when silence words are explicitly in the input transcripts GitHub #471

2.0.0

  • Updated and expanded documentation

  • Added the ability to train Phonetisaurus-style G2P models

  • Added support for mixing dictionary formats (i.e., a dictionary can mix non-probabilistic lines with lines that include pronunciation and silence probabilities)

  • Added support for exporting alignments in CSV format

  • Updated JSON export format to be more idiomatic JSON GitHub #453

  • Fixed a crash where initial training rounds with many jobs would result in jobs that had no utterances GitHub #468