2.0 Changelog

2.0.6

Added G2P and language model training support on Windows
Fixed a bug where exporting trained models to the current working directory would not work GitHub #494
Fixed a crash in exporting transcriptions to TextGrids
Added support for parsing out longer quoted strings GitHub #492
Fixed the error message for files with no file extension GitHub #495
Fixed a PhoneSetType error for some models trained on earlier versions GitHub #496 and GitHub #484

Standardized the pronunciation dictionary format to require tab delimitation between orthography, pronunciations, and any probabilities in the dictionary GitHub #478
Fixed a bug in pronunciation probability estimation when silence words are explicitly transcribed GitHub #476
Fixed an optimization bug introduced when fixing sparse job/subset combos
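
As a rough illustration of the tab-delimited dictionary format mentioned above, the sketch below splits a single line into its parts. The function name `parse_dictionary_line` is hypothetical, and the field order (word first, any probabilities in the middle, space-separated phones last) is an assumption made for this example rather than a statement of how the aligner parses dictionaries internally.

```python
def parse_dictionary_line(line):
    """Split one tab-delimited dictionary line into (word, probabilities, phones).

    Assumed layout (illustrative only): the first field is the orthography,
    the last field is the space-separated pronunciation, and any fields in
    between are numeric probabilities.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 2:
        raise ValueError(f"expected at least two tab-delimited fields: {line!r}")
    word, *middle, pronunciation = fields
    # float() raises ValueError if a middle field is not numeric
    probabilities = [float(value) for value in middle]
    return word, probabilities, pronunciation.split()


if __name__ == "__main__":
    print(parse_dictionary_line("the\tdh ah"))
    print(parse_dictionary_line("the\t0.99\tdh ah"))
```

Because only tabs separate fields in this sketch, a line whose first field contains a space keeps that space as part of the word.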

Optimized Phonetisaurus training regime for phone and grapheme orders greater than 1
Fixed a bug in parsing dictionaries that included whitespace as part of the word
Fixed a bug in Phonetisaurus generation where insertions and deletions were not being properly generated
Changed the default alignment separator for Phonetisaurus from } to ; (which shouldn’t conflict with most phone sets) and added extra validation to ensure that special symbols are not present in the dictionary
Fixed a bug where a trained Phonetisaurus model was not properly using its grapheme order
Fixed a bug when saving a Phonetisaurus model after evaluating it
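
The kind of check described above (making sure reserved symbols do not appear in the dictionary) could look roughly like the following. This is a minimal sketch, not the validation the aligner actually performs: `find_reserved_symbol_conflicts` is a hypothetical helper, and the reserved set contains only the `;` separator named in this changelog, so any other special symbols would need to be added by the caller.

```python
ALIGNMENT_SEPARATOR = ";"  # default separator named in the changelog entry


def find_reserved_symbol_conflicts(dictionary, reserved=(ALIGNMENT_SEPARATOR,)):
    """Return (word, symbol) pairs where a reserved symbol occurs in a word or its phones.

    `dictionary` maps each word to a list of phone strings (an assumed,
    simplified representation used only for this example).
    """
    conflicts = []
    for word, phones in dictionary.items():
        tokens = [word, *phones]
        for symbol in reserved:
            if any(symbol in token for token in tokens):
                conflicts.append((word, symbol))
    return conflicts


if __name__ == "__main__":
    lexicon = {"cat": ["k", "ae", "t"], "a;b": ["ey", "b"]}
    for word, symbol in find_reserved_symbol_conflicts(lexicon):
        print(f"{word!r} contains reserved symbol {symbol!r}")
```

Reporting every conflict, rather than stopping at the first, lets a user fix the dictionary in one pass before training.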

Fixed a typo in the save model message GitHub #470
Fixed an issue with offset alignments when silence words are explicitly in the input transcripts GitHub #471

Updated and expanded documentation
Added ability to train Phonetisaurus style G2P models
Added support for mixing dictionary formats (i.e., a dictionary can combine lines without probabilities and lines that include pronunciation and silence probabilities)
Added support for exporting alignments in CSV format
Updated JSON export format to be more idiomatic JSON GitHub #453
Fixed a crash where initial training rounds with many jobs would result in jobs that had no utterances GitHub #468