class montreal_forced_aligner.db.SoundFile(**kwargs)[source]#

Bases: Base

Database class for storing information about sound files

Parameters:

  • file_id (int) – Foreign key to File

  • file (File) – Root file

  • sound_file_path (Path) – Path to the audio file

  • format (str) – Format of the audio file (flac, wav, mp3, etc.)

  • sample_rate (int) – Sample rate of the audio file

  • duration (float) – Duration of the audio file

  • num_channels (int) – Number of channels in the audio file

  • sox_string (str) – String that Kaldi will use to process the sound file

normalized_waveform(begin=0, end=None)[source]#

Load a normalized waveform for acoustic processing/visualization

Parameters:

  • begin (float, optional) – Starting time point to return, defaults to 0

  • end (float, optional) – Ending time point to return, defaults to the end of the file


Returns:

  • numpy.array – Time points

  • numpy.array – Sample values
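
The shape of the return value (an array of time points paired with an array of sample values) can be illustrated with a small stand-in. This is a sketch only, not MFA's implementation: `sketch_normalized_waveform` is a hypothetical helper that works on an in-memory array instead of a `SoundFile`, assumes `begin` and `end` are in seconds, and assumes "normalized" means peak normalization to the range [-1, 1].

```python
import numpy as np


def sketch_normalized_waveform(samples, sample_rate, begin=0.0, end=None):
    """Hypothetical stand-in for SoundFile.normalized_waveform.

    Assumptions (not confirmed by the MFA documentation): times are in
    seconds and normalization divides by the peak absolute amplitude.
    """
    samples = np.asarray(samples, dtype=float)
    duration = len(samples) / sample_rate
    # Mirror the documented default: no end means "up to the end of the file".
    if end is None or end > duration:
        end = duration
    start_idx = int(round(begin * sample_rate))
    stop_idx = int(round(end * sample_rate))
    segment = samples[start_idx:stop_idx]
    # Peak-normalize, leaving silent segments untouched to avoid dividing by zero.
    peak = np.max(np.abs(segment)) if segment.size else 0.0
    if peak > 0:
        segment = segment / peak
    # One time point per returned sample, spanning the requested window.
    times = np.linspace(begin, end, num=segment.size)
    return times, segment


# Synthetic input: one second of a quiet 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = 0.25 * np.sin(2 * np.pi * 440 * t)

times, values = sketch_normalized_waveform(tone, sample_rate, begin=0.25, end=0.75)
```

The two arrays have the same length, so they can be passed directly as the x and y arguments of a plotting call.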