FeatureConfigMixin

class montreal_forced_aligner.corpus.features.FeatureConfigMixin(feature_type='mfcc', use_energy=True, frame_shift=10, frame_length=25, snip_edges=False, low_frequency=20, high_frequency=7800, sample_frequency=16000, allow_downsample=True, allow_upsample=True, dither=0.0001, energy_floor=1.0, num_coefficients=13, num_mel_bins=23, cepstral_lifter=22, preemphasis_coefficient=0.97, uses_cmvn=True, uses_deltas=True, uses_splices=False, uses_voiced=False, adaptive_pitch_range=False, uses_speaker_adaptation=False, fmllr_update_type='full', silence_weight=0.0, splice_left_context=3, splice_right_context=3, use_pitch=False, use_voicing=False, use_delta_pitch=False, min_f0=50, max_f0=800, delta_pitch=0.005, penalty_factor=0.1, **kwargs)

Bases: object

Class to store configuration information about MFCC generation

Variables:
  • feature_type (str) – Feature type, defaults to “mfcc”

  • use_energy (bool) – Flag for whether energy should be used as the first coefficient, defaults to True

  • frame_shift (int) – Number of milliseconds between frames, defaults to 10

  • snip_edges (bool) – Flag for enabling Kaldi’s snip edges behavior, intended to give better time precision, defaults to False

  • use_pitch (bool) – Flag for including pitch in features, defaults to False

  • low_frequency (int) – Frequency floor in Hz, defaults to 20

  • high_frequency (int) – Frequency ceiling in Hz, defaults to 7800

  • sample_frequency (int) – Sampling frequency in Hz, defaults to 16000

  • allow_downsample (bool) – Flag for whether to allow downsampling, default is True

  • allow_upsample (bool) – Flag for whether to allow upsampling, default is True

  • uses_cmvn (bool) – Flag for whether to use CMVN, default is True

  • uses_deltas (bool) – Flag for whether to use delta features, default is True

  • uses_splices (bool) – Flag for whether to use splices and LDA transformations, default is False

  • uses_speaker_adaptation (bool) – Flag for whether to use speaker adaptation, default is False

  • fmllr_update_type (str) – Type of fMLLR estimation, defaults to “full”

  • silence_weight (float) – Weight of silence in calculating LDA or fMLLR, defaults to 0.0

  • splice_left_context (int or None) – Number of frames to splice on the left for calculating LDA, defaults to 3

  • splice_right_context (int or None) – Number of frames to splice on the right for calculating LDA, defaults to 3
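
To make the framing parameters concrete, the sketch below converts frame_shift and frame_length from milliseconds to samples at sample_frequency and counts the resulting frames, following Kaldi's frame-count rule for snip_edges. The helper names ms_to_samples and num_frames are hypothetical and not part of MFA; the defaults are copied from the signature above, and how MFA passes these values to Kaldi is assumed rather than confirmed here.

```python
def ms_to_samples(milliseconds: int, sample_frequency: int = 16000) -> int:
    """Convert a duration in milliseconds to a whole number of samples."""
    return sample_frequency * milliseconds // 1000


def num_frames(
    num_samples: int,
    frame_shift: int = 10,
    frame_length: int = 25,
    sample_frequency: int = 16000,
    snip_edges: bool = False,
) -> int:
    """Count frames for a signal (hypothetical helper, not an MFA function)."""
    shift = ms_to_samples(frame_shift, sample_frequency)
    length = ms_to_samples(frame_length, sample_frequency)
    if snip_edges:
        # Only frames that fit entirely inside the signal are kept
        if num_samples < length:
            return 0
        return 1 + (num_samples - length) // shift
    # Frame count depends only on the shift, rounded to the nearest frame
    return (num_samples + shift // 2) // shift
```

With the defaults, one second of 16 kHz audio (16000 samples) gives 100 frames, while the same signal with snip_edges=True gives 98.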

property alignment_model_path

Abstract method for alignment model path

calc_fmllr()

Abstract method for calculating fMLLR transforms

property corpus_output_directory

Abstract method for working directory of corpus

property data_directory

Abstract method for corpus data directory

property feature_options

Parameters for feature generation

property fmllr_options

Options for use in calculating fMLLR transforms

property lda_options

Options for computing LDA
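
The splice_left_context and splice_right_context settings control how many neighboring frames are stacked onto each frame before LDA. The sketch below illustrates that stacking in plain Python. The function splice_frames is a hypothetical helper, not an MFA or Kaldi call, and repeating the first and last frames at the boundaries is an assumption made for this illustration.

```python
from typing import List


def splice_frames(
    frames: List[List[float]],
    left_context: int = 3,
    right_context: int = 3,
) -> List[List[float]]:
    """Stack each frame with its neighbors (hypothetical helper)."""
    last = len(frames) - 1
    spliced = []
    for t in range(len(frames)):
        row: List[float] = []
        for offset in range(-left_context, right_context + 1):
            # Clamp the index so edge frames are repeated (assumed behavior)
            neighbor = min(max(t + offset, 0), last)
            row.extend(frames[neighbor])
        spliced.append(row)
    return spliced
```

With 13 coefficients per frame and the default context of 3 frames on each side, every spliced frame has 13 × 7 = 91 values.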

property mfcc_options

Parameters to use in computing MFCC features.
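
As a rough illustration of how a configuration class can expose a grouped set of MFCC parameters through a property, consider the stand-in below. SketchFeatureConfig is a hypothetical class, not the MFA implementation, and the dictionary keys are illustrative assumptions; only the default values are taken from the signature above.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class SketchFeatureConfig:
    """Hypothetical stand-in for illustration; NOT the MFA class."""

    use_energy: bool = True
    frame_shift: int = 10
    frame_length: int = 25
    low_frequency: int = 20
    high_frequency: int = 7800
    sample_frequency: int = 16000
    num_coefficients: int = 13
    num_mel_bins: int = 23

    @property
    def mfcc_options(self) -> Dict[str, Any]:
        # Key names are assumptions chosen to mirror the attribute names
        return {
            "use_energy": self.use_energy,
            "frame_shift": self.frame_shift,
            "frame_length": self.frame_length,
            "low_frequency": self.low_frequency,
            "high_frequency": self.high_frequency,
            "sample_frequency": self.sample_frequency,
            "num_coefficients": self.num_coefficients,
            "num_mel_bins": self.num_mel_bins,
        }
```

Computing the dictionary in a property means that a changed attribute, for example SketchFeatureConfig(frame_shift=5), is reflected the next time the options are read.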

property model_path

Abstract method for model path

property pitch_options

Parameters to use in computing pitch features.

property vad_options

Abstract method for VAD options

property working_directory

Abstract method for working directory