Data preparation

Prior to running the aligner, make sure the following are set up:

1. A pronunciation dictionary for your language, specifying the pronunciations of the words used in the orthographic transcriptions.
2. The sound files to align.
3. Orthographic annotations, either in .lab files for individual sound files (Prosodylab-aligner format) or in TextGrid intervals for longer sound files (TextGrid format).
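
Before running the aligner, it can help to confirm that every sound file has an annotation file next to it. The following is a minimal sketch, not part of the aligner itself. It assumes that sound files use the .wav extension and that each annotation file shares its sound file's base name (for example, recording1.wav with recording1.lab or recording1.TextGrid). The function name find_unpaired and its parameters are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def find_unpaired(corpus_dir, audio_ext=".wav", annotation_exts=(".lab", ".TextGrid")):
    """Return relative paths of sound files that have no same-named annotation file.

    Assumption (not taken from the aligner's documentation): an annotation
    file sits in the same directory as its sound file and has the same base name.
    """
    corpus = Path(corpus_dir)
    missing = []
    # Walk the corpus recursively so speaker subdirectories are covered too.
    for audio in sorted(corpus.rglob(f"*{audio_ext}")):
        has_annotation = any(
            audio.with_suffix(ext).exists() for ext in annotation_exts
        )
        if not has_annotation:
            missing.append(audio.relative_to(corpus).as_posix())
    return missing
```

Calling find_unpaired on a corpus directory returns an empty list when every sound file is paired with an annotation file; any paths it returns are sound files that still need a transcription.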


A collection of preprocessing scripts for converting corpora from other formats into one the aligner accepts is available in the MFA-reorganization-scripts repository.