Utterance
- class montreal_forced_aligner.db.Utterance(**kwargs)
Bases: Base
Database class for storing information about utterances.
- Parameters:
id (int) – Primary key
begin (float) – Beginning timestamp of the utterance
end (float) – Ending timestamp of the utterance, -1 if there is no audio file
duration (float) – Duration of the utterance
channel (int) – Channel of the utterance in the audio file
num_frames (int) – Number of feature frames extracted
text (str) – Input text for the utterance
oovs (str) – Space-delimited list of items that were not found in the speaker’s pronunciation dictionary
normalized_text (str) – Normalized text for the utterance, after removing case and punctuation, and splitting up compounds and clitics if the whole word is not found in the speaker’s pronunciation dictionary
features (str) – File index for generated features
in_subset (bool) – Flag for whether to use this utterance in the current training subset
ignored (bool) – Flag for whether the utterance is ignored because it lacks features
alignment_log_likelihood (float) – Log likelihood for the alignment of the utterance, taking both speech and silence phones into consideration
speech_log_likelihood (float) – Log likelihood for the alignment of the utterance, taking only the speech phones into consideration
duration_deviation (float) – Average absolute z-score of the speech phone durations
phone_error_rate (float) – Phone error rate for alignment evaluation
alignment_score (float) – Alignment score from alignment evaluation
word_error_rate (float) – Word error rate for transcription evaluation
character_error_rate (float) – Character error rate for transcription evaluation
file (File) – File object that the utterance is from
speaker (Speaker) – Speaker object of the utterance
phone_intervals (list[PhoneInterval]) – Reference phone intervals
word_intervals (list[WordInterval]) – Aligned word intervals
job (Job) – Job that processes the utterance
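Two of the fields above lend themselves to a short illustration: `oovs` is a space-delimited string, and `duration_deviation` is an average of absolute z-scores. The following is a minimal sketch, not MFA's implementation. The helper names, the `(label, duration)` tuple layout, and the per-phone `(mean, std)` statistics table are all assumptions made for the example; the source does not say how the duration statistics are gathered.

```python
from __future__ import annotations

from statistics import mean


def parse_oovs(oovs: str) -> list[str]:
    """Split a space-delimited ``oovs`` string into individual items."""
    return oovs.split()


def duration_deviation(phones, stats) -> float:
    """Mean absolute z-score of phone durations.

    phones: list of (label, duration_seconds) tuples (hypothetical layout)
    stats:  dict mapping label -> (mean_duration, std_duration) (hypothetical)
    """
    z_scores = []
    for label, duration in phones:
        mu, sigma = stats[label]
        if sigma > 0:  # skip phones without usable duration statistics
            z_scores.append(abs((duration - mu) / sigma))
    # Fall back to 0.0 when no phone has usable statistics
    return mean(z_scores) if z_scores else 0.0
```

A higher value under this sketch would mean the phone durations in the utterance sit further from their typical durations, which is one way such a field could flag suspicious alignments.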
- property aligned_phone_intervals
Phone intervals from montreal_forced_aligner.data.WorkflowType.alignment
- property aligned_word_intervals
Word intervals from montreal_forced_aligner.data.WorkflowType.alignment
- property file_name
Name of the utterance’s file
- classmethod from_data(data, file, speaker, frame_shift=None)
Generate an utterance object from UtteranceData
- Parameters:
data (UtteranceData) – Data for the utterance
file (File) – File database object for the utterance
speaker (Speaker) – Speaker database object for the utterance
frame_shift (int, optional) – Frame shift in ms to use for calculating the number of frames in the utterance
- Returns:
Utterance object
- Return type:
Utterance
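The `frame_shift` parameter ties the utterance's timestamps to its `num_frames` value. As a rough sketch of that relationship, the hypothetical helper below assumes the frame count is simply the utterance duration divided by the frame shift, rounded down. The actual calculation inside `from_data` may differ, for example in how it rounds.

```python
def estimate_num_frames(begin: float, end: float, frame_shift_ms: int) -> int:
    """Estimate the number of feature frames spanned by an utterance.

    Assumption for this sketch: one frame per ``frame_shift_ms`` milliseconds,
    with any partial frame at the end discarded.
    """
    duration_ms = (end - begin) * 1000  # begin and end are in seconds
    return int(duration_ms // frame_shift_ms)
```

For instance, under this assumption a 2-second utterance with a 10 ms frame shift would span 200 frames.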
- property per_speaker_transcribed_phone_intervals
Phone intervals from montreal_forced_aligner.data.WorkflowType.per_speaker_transcription
- property per_speaker_transcribed_word_intervals
Word intervals from montreal_forced_aligner.data.WorkflowType.per_speaker_transcription
- phone_intervals_for_workflow(workflow_id)
Extract phone intervals for a given CorpusWorkflow
- Parameters:
workflow_id (int) – Integer ID for CorpusWorkflow
- Returns:
List of phone intervals
- Return type:
list[CtmInterval]
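Conceptually, extracting intervals for a workflow means selecting the stored intervals whose workflow ID matches. The sketch below illustrates that idea on plain Python objects rather than on the database. `IntervalRow` and `intervals_for_workflow` are hypothetical stand-ins, not MFA classes, and ordering the result by start time is an assumption; the real method returns CtmInterval objects.

```python
from dataclasses import dataclass


@dataclass
class IntervalRow:
    """Hypothetical stand-in for a stored phone interval row."""

    begin: float
    end: float
    label: str
    workflow_id: int


def intervals_for_workflow(rows, workflow_id):
    """Return the rows produced by one workflow, ordered by start time."""
    selected = [row for row in rows if row.workflow_id == workflow_id]
    return sorted(selected, key=lambda row: row.begin)
```

The same selection pattern would apply to word intervals, since both methods take a single `workflow_id` argument.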
- property phone_transcribed_phone_intervals
Phone intervals from montreal_forced_aligner.data.WorkflowType.phone_transcription
- property reference_phone_intervals
Phone intervals from montreal_forced_aligner.data.WorkflowType.reference
- property speaker_name
Name of the utterance’s speaker
- to_data()
Construct an UtteranceData object that can be used in multiprocessing
- Returns:
Data for the utterance
- Return type:
UtteranceData
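The general pattern behind a method like `to_data()` is to copy the values a worker needs out of the database object into a plain data object. The sketch below shows that pattern with hypothetical names: `PlainUtterance` is not MFA's UtteranceData, and the choice of fields is an assumption made for illustration.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PlainUtterance:
    """Hypothetical plain-data snapshot of an utterance."""

    begin: float
    end: float
    channel: int
    text: str


def to_plain(record) -> PlainUtterance:
    """Copy scalar fields from any object exposing them as attributes."""
    return PlainUtterance(
        begin=record.begin,
        end=record.end,
        channel=record.channel,
        text=record.text,
    )
```

Because the snapshot holds only simple values, `asdict(to_plain(record))` yields an ordinary dictionary that is straightforward to hand to another process.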
- to_kalpy()
Construct an UtteranceData object that can be used in multiprocessing
- Returns:
Data for the utterance
- Return type:
- property transcribed_phone_intervals
Phone intervals from montreal_forced_aligner.data.WorkflowType.transcription
- property transcribed_word_intervals
Word intervals from montreal_forced_aligner.data.WorkflowType.transcription
- word_intervals_for_workflow(workflow_id)
Extract word intervals for a given CorpusWorkflow
- Parameters:
workflow_id (int) – Integer ID for CorpusWorkflow
- Returns:
List of word intervals
- Return type:
list[CtmInterval]