Montreal Forced Aligner
Montreal Forced Aligner is: "a command line utility for performing forced alignment of speech datasets using Kaldi" as stated in their Github repo.
Usage
For this particular case I used the following command:
mfa align tanakh_niqqud hebrew_words_english_sounds.dict english tanakh_niqqud_aligned/
lets review each parameter individually
mfa align
With the installation complete along with its dependencies, we can use the "align" function from "mfa"
tanakh_niqqud
This is the folder where we need to place pairs of files with extensions "lab": text and "wav": audio; for example:
(base) netmaster@netmaster-VirtualBox:~/mfa/mfa_data/tanakh_niqqud$ ls -ltotal 5952-rw-rw-r-- 1 netmaster netmaster 985 Shin 22 18:21 t2601.lab-rw-rw-r-- 1 netmaster netmaster 2321890 Shin 22 18:21 t2601.wav-rw-rw-r-- 1 netmaster netmaster 1455 Shin 22 18:21 t2602.lab-rw-rw-r-- 1 netmaster netmaster 3763864 Shin 22 18:21 t2602.wav
"lab" files
Each "lab" file must contain the text matching exactly its correspondng "wav" file. One step needed to produce this text was the replacement of the word "יְהוָה" (Yaweh) for "אֲדֹנָי" (Adonay) because the reader of the text follows that tradition while reading; also the characters like "-" and "׃" needed to be removed.
אַשְׁרֵי הָאִישׁ אֲשֶׁר לֹא הָלַךְ בַּעֲצַת רְשָׁעִים וּבְדֶרֶךְ חַטָּאִים לֹא עָמָד וּבְמוֹשַׁב לֵצִים לֹא יָשָׁב כִּי אִם בְּתוֹרַת אֲדֹנָי חֶפְצוֹ וּבְתוֹרָתוֹ יֶהְגֶּה יוֹמָם וָלָיְלָה וְהָיָה כְּעֵץ שָׁתוּל עַל פַּלְגֵי מָיִם אֲשֶׁר פִּרְיוֹ יִתֵּן בְּעִתּוֹ וְעָלֵהוּ לֹא יִבּוֹל וְכֹל אֲשֶׁר יַעֲשֶׂה יַצְלִיחַ לֹא כֵן הָרְשָׁעִים כִּי אִם כַּמֹּץ אֲשֶׁר תִּדְּפֶנּוּ רוּחַ עַל כֵּן לֹא יָקֻמוּ רְשָׁעִים בַּמִּשְׁפָּט וְחַטָּאִים בַּעֲדַת צַדִּיקִים כִּי יוֹדֵעַ אֲדֹנָי דֶּרֶךְ צַדִּיקִים וְדֶרֶךְ רְשָׁעִים תֹּאבֵד
hebrew_words_english_sounds.dict
A text file containing on each line a word and its phonetic representation
אֲדֹנָי AE0 D AH0 N EY1אִם IH0 Mעַל AH0 Lעָמָד AA1 M AA1 Dאֲשֶׁר AE1 SH EH1 Rאַשְׁרֵי AE1 SH R EH1בַּעֲצַת B AA1 AH0 T S AA1 Tבַּעֲדַת B AA1 AA1 D AA1 Tבַּמִּשְׁפָּט B AA1 M IH0 SH P AE2 THבְּעִתּוֹ B EH1 IH0 T OW0בְּתוֹרַת B EH1 T OW1 R AA1 Tצַדִּיקִים T SH AA1 D IH0 K IY1 Mחַטָּאִים JH AH0 T AA1 IY1 Mחֶפְצוֹ JH EH1 F T S OW1דֶּרֶךְ D EH1 R EH1 JH EH1הָאִישׁ AY2 SHהָלַךְ JH AH0 L AH0 JHהָרְשָׁעִים AH0 R EH1 SH AH0 IH0 Mכַּמֹּץ K AH0 M OW1 T Sכְּעֵץ K EH1 T Sכֵּן JH EH1 Nכִּי K IH0לֵצִים L EH1 T S IY1 Mלֹא L OW1מָיִם M AA1 Y IH0 Mפַּלְגֵי P AA1 L G EH1פִּרְיוֹ P IH0 R Y OW1רְשָׁעִים R EH1 SH AA1 IH0 Mרוּחַ׃ R UH0 AH0 JHשָׁתוּל SH AA1 T UW2 Lתִּדְּפֶנּוּ T IH0 D EH1 F EH1 N UH0תֹּאבֵד T OW1 B EH1 Dוּבְדֶרֶךְ UW2 V D EH1 R EH1 JHוּבְמוֹשַׁב UW2 V M OW1 SH AH0 Vוּבְתוֹרָתוֹ UW2 V T AO1 R AA1 T OW0וָלָיְלָה V AA1 L AA1 Y L AH0וְעָלֵהוּ V EH1 AA1 L EH1 UW2וְחַטָּאִים V EH1 JH AH0 T AA1 IH0 Mוְדֶרֶךְ V EH1 D EH1 R EH1 JHוְהָיָה V EH1 JH AA1 Y AH0וְכֹל V EH1 JH OW1 Lיַעֲשֶׂה Y AA1 AA1 S EH1יַצְלִיחַ Y AA1 T S L IH0 AH0 JHיָקֻמוּ Y AA1 K UH0 M UH0יָשָׁב Y AA0 SH AA0 Vיֶהְגֶּה Y EH0 G EH0יִבּוֹל Y IH0 B OW1 Lיִתֵּן Y IH0 T EH0 Nיוֹדֵעַ Y OW1 D EH0 AH0יוֹמָם Y OW1 M AA0 M
This is one of the points where complications arise, mainly due to the lack of Hebrew language trained models. For the MontrealFA library in particular there were no trained models available for Hebrew, nor a lexicon of words-phonetics that I could find. Fortunatelly, using the, very well trained, English Acoustic Model along with a lexicon that I build myself using english-phonetics for Hebrew words, the results were excellent. The only issue now is to build or find this HebrewWord-English-phonemes pairs for all the words in the Tanakh.
For this part, using the translation table obtained from Biblehub, a lexicon for each chapter can be generated and used.
english
this sets the acoustic model to use during the alignment process, so we will use the english model to match our english-phonemes. Maybe later, using all the matching words and phonemes times it would be interesting use MontrealFA to train an Hebrew model.
tanakh_niqqud_aligned
Finally we provide the name of the folder where the results will appear.
The results
The Force Alignment process, for the first two Psalms looks as follows:
(base) netmaster@netmaster-VirtualBox:~/mfa/mfa_data$ conda activate aligner(aligner) netmaster@netmaster-VirtualBox:~/mfa/mfa_data$ mfa align tanakh_niqqud hebrew_words_english_sounds.dict english tanakh_niqqud_aligned/All required kaldi binaries were found!/home/netmaster/Documents/MFA/tanakh_niqqud/align.logINFO - Setting up corpus information...INFO - Number of speakers in corpus: 1, average number of utterances per speaker: 2.0INFO - Parsing dictionary without pronunciation probabilities without silence probabilitiesINFO - Creating dictionary information...INFO - Setting up training data...Generating base features (mfcc)...Calculating CMVN...INFO - Done with setup!INFO - Performing first-pass alignment...INFO - Calculating fMLLR for speaker adaptation...INFO - Performing second-pass alignment...INFO - All done!
which results in the following files on the output folder:
-rw-rw-r-- 1 netmaster netmaster 47230 Shin 22 21:42 t2601.TextGrid-rw-rw-r-- 1 netmaster netmaster 33341 Shin 22 21:42 t2602.TextGrid
were each file contains the words and phonemes matchin timestamps in a format like:
File type = "ooTextFile"Object class = "TextGrid"xmin = 0xmax = 52.6488tiers? <exists>size = 2item []:item [1]:class = "IntervalTier"name = "words"xmin = 0xmax = 52.6488intervals: size = 103intervals [1]:xmin = 0xmax = 0.07text = ""intervals [2]:xmin = 0.07xmax = 0.76text = "אַשְׁרֵי"intervals [3]:xmin = 0.76xmax = 1.45text = "הָאִישׁ"
this is where the forced alognment process finishes
Example
"Psalm 1" is aligned using the following command:
mfa align --clean by-chapter/t2601 by-chapter/t2601/t2601.dict english by-chapter-aligned/t2601
Testing the Results
The resulting ".TextGrid" file can be read with Praat,
I got to use the small python visualizer I coded too, just adjusted it a bit to be able to process the data exported as a table from Praat