MFA Instructions

Uploaded by

grayeggsandsam

Montreal Forced Aligner: Notes and Instructions

I installed the MFA via Miniconda, a minimal installer for conda (a package and environment
manager for Python, R, and other languages) that kept things nice and neat. This would
presumably all work with a full Anaconda installation or a plain Python 3 setup as well.

Miniconda available here:

https://docs.conda.io/en/latest/miniconda.html

Detailed instructions on MFA installation may be found here:

https://montreal-forced-aligner.readthedocs.io/en/latest/installation.html

Once conda/Anaconda/Miniconda/Python/what-have-you is installed, open the prompt:

conda create -n aligner -c conda-forge montreal-forced-aligner

This creates a new environment ("aligner") and installs the MFA.

There are other installation methods if this does not work (e.g. via pip) – see the site above.

Enter the new environment:

conda activate aligner
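
If you want to confirm that the environment really is active before going further, one option is to check whether the mfa executable is visible on your PATH. This is only a minimal sketch in Python (which ships with the environment); the helper name tool_on_path is made up for illustration and is not part of the MFA.

```python
import shutil


def tool_on_path(name: str) -> bool:
    """Return True if an executable called `name` can be found on PATH."""
    return shutil.which(name) is not None


# With the "aligner" environment active, this would be expected to print True.
print("mfa found:", tool_on_path("mfa"))
```

If it prints False, the environment is probably not activated in that prompt.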

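Now to install other dependencies/tools. Run only one of the two PyTorch commands: the first
is the CPU-only build, the second is the CUDA 11.7 build for machines with an NVIDIA GPU.

conda install pytorch torchvision torchaudio cpuonly -c pytorch

conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia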

pip install speechbrain

Great, the MFA is installed.

It works by taking a pronunciation dictionary and an acoustic model and applying them to your
corpus.

There are methods of training your own model, and a ton of dictionaries for various languages
are already available, but here we'll install the ones for US English. These use the ARPAbet
(essentially ASCII-friendly IPA for US English) – there are ones available in X-SAMPA and
other formats as well.

Make sure you are still in the aligner environment:

conda activate aligner

And install the dictionary and model:

mfa model download acoustic english_us_arpa

mfa model download dictionary english_us_arpa

Test this by fetching information about the model:

mfa model inspect acoustic english_us_arpa

Great, the model and dictionary are ready. You can specify a directory to install these (check
the website for instructions there) – I just let them sit in the default folder, Documents/MFA.
You can also track down the dictionary's wordlist, if you like, to check whether a given token
is included.

Prepare your dataset. This should be a single folder with paired .wav and .txt files, where the
latter are transcriptions of the former. They should have the exact same filename, except for
the extension.
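
A quick way to catch a missing or misnamed file before running the aligner is to compare filenames programmatically. The following is a minimal sketch, assuming a single flat folder as described above; find_unpaired is a hypothetical helper written for this note, not an MFA function.

```python
from pathlib import Path


def find_unpaired(folder):
    """Return (wavs_without_txt, txts_without_wav) as sorted lists of file stems."""
    files = [p for p in Path(folder).iterdir() if p.is_file()]
    wav_stems = {p.stem for p in files if p.suffix.lower() == ".wav"}
    txt_stems = {p.stem for p in files if p.suffix.lower() == ".txt"}
    # Set differences leave only the names that lack a partner.
    return sorted(wav_stems - txt_stems), sorted(txt_stems - wav_stems)
```

Calling find_unpaired on your input folder should give two empty lists when every recording has a transcript and vice versa.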

Transcription convention may depend on the dictionary you're using – for this one,
capitalization of letters does not matter. Nor does spacing, except between words, though I
like to use line breaks between clauses for readability.

This dictionary also has a few "words" for non-speech sounds.

In my experience it's not great at identifying/aligning laughter; when in doubt, go with <unk>.

A transcription will look something like this:

IMPORTANT NOTE: Ensure the encoding for this .txt file is UTF-8.
The aligner will throw a rather unhelpful-to-diagnose error if it's in ANSI.
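
If you are unsure how a transcript was saved, the encoding can be checked (and fixed) with a few lines of Python. This is a rough sketch under one stated assumption: that "ANSI" on your machine means Windows-1252 (cp1252), which is common on Western-language Windows setups but not universal. The function names are invented for illustration.

```python
from pathlib import Path


def is_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def convert_to_utf8(path, source_encoding="cp1252"):
    """Rewrite a file as UTF-8, ASSUMING it is currently in `source_encoding`."""
    path = Path(path)
    text = path.read_bytes().decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))
```

Note that a plain-ASCII file passes the UTF-8 check as well, since ASCII is a subset of UTF-8; trouble only shows up once the text contains characters such as curly quotes or accented letters.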

Put these in the same folder as your audio files. You can run more than one at a time; if any
run into a critical problem, the aligner will skip over them.

My input path was: C:\Users\graye\Desktop\MFAProcessing\Input

My output path was: C:\Users\graye\Desktop\MFAProcessing\Output

(Yes, my folder naming methods are very creative.)

Again, make sure you're in the aligner environment:

conda activate aligner

Validate your dataset to check for errors, replacing the directory as relevant:

mfa validate C:\Users\graye\Desktop\MFAProcessing\Input english_us_arpa english_us_arpa

Those last two arguments are the dictionary and the acoustic model, in that order (here they
share the same name).

Assuming there are no errors, align with the following:

mfa align C:\Users\graye\Desktop\MFAProcessing\Input english_us_arpa english_us_arpa C:\Users\graye\Desktop\MFAProcessing\Output --clean

Do not forget the --clean at the end!
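
If you end up aligning several folders, it can be handy to assemble the command in a script so the argument order (corpus, dictionary, acoustic model, output) and the --clean flag are never mistyped. The sketch below only builds and prints the command; build_align_command is a hypothetical helper, and the paths are simply the ones used in these notes.

```python
def build_align_command(corpus_dir, dictionary, acoustic_model, output_dir, clean=True):
    """Assemble the `mfa align` call from this walkthrough as an argument list."""
    command = ["mfa", "align", str(corpus_dir), dictionary, acoustic_model, str(output_dir)]
    if clean:
        command.append("--clean")
    return command


command = build_align_command(
    r"C:\Users\graye\Desktop\MFAProcessing\Input",
    "english_us_arpa",
    "english_us_arpa",
    r"C:\Users\graye\Desktop\MFAProcessing\Output",
)
print(" ".join(command))
# To actually run it (inside the aligner environment), something like:
# import subprocess; subprocess.run(command, check=True)
```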

This will take a little while! It does provide helpful progress bars.

The results should be .TextGrid files in your output folder, with the same filenames:

These can then be loaded into Praat alongside their corresponding audio file.
Behold! It's so beautiful.
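
Because the aligner skips files that hit a critical problem, it is worth confirming that every recording actually received a TextGrid. Here is a minimal sketch, assuming flat input and output folders like the ones above; missing_textgrids is an illustrative name, not something the MFA provides.

```python
from pathlib import Path


def missing_textgrids(input_dir, output_dir):
    """Return stems of .wav files in input_dir with no matching .TextGrid in output_dir."""
    wav_stems = {p.stem for p in Path(input_dir).iterdir() if p.suffix.lower() == ".wav"}
    grid_stems = {p.stem for p in Path(output_dir).iterdir() if p.suffix.lower() == ".textgrid"}
    return sorted(wav_stems - grid_stems)
```

An empty list means nothing was skipped; any names that come back are the recordings to inspect first.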

These TextGrids are formatted with two tiers – the first is word-level, the second is
phone-level in ARPAbet (with digits on the vowels marking stress). You should manually scan
through the output for discrepancies and errors – I had very little trouble with Rainbow
Passage text readings (this example is pretty typical, and a very clean result), but
spontaneous speech with disfluencies, fillers, etc. was more likely to spark some oddities.
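
When working with the phone tier in a script, you will often want the vowel and its stress digit separately. In ARPAbet the digit is attached to the vowel label (0 for unstressed, 1 for primary stress, 2 for secondary stress), and consonants carry no digit. A small sketch of splitting such labels follows; split_stress is a made-up helper name.

```python
def split_stress(phone):
    """Split an ARPAbet label into (base, stress); stress is None when there is no digit."""
    if phone and phone[-1].isdigit():
        return phone[:-1], int(phone[-1])
    return phone, None


# The word "cat" in ARPAbet is K AE1 T.
for label in ["K", "AE1", "T"]:
    print(label, split_stress(label))
```

This makes it straightforward to, say, keep only intervals with primary stress when measuring vowels.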

Thou hast aligned successfully! Now go forth, and make use of Praat scripts!
