
Pre-Test Quick Guide

Turns are transcription boxes for speeches with precise start and end times. If a speaker pauses for 500ms or more, end the current turn and create a new one when they resume. Turns cannot exceed 30 seconds. Labels are for non-speech sounds without time limits but two of the same label cannot overlap. Common labels include unintelligible, foreign speech, PII, singing, and noise. Speakers should be numbered until their name is known, then named. Overlapping and intersecting turns from the same speaker are not allowed.

Uploaded by

So

Speech Turns: 

Turns are the transcription boxes you create whenever speech is heard; the speech is transcribed inside 
those boxes. The start time and end time of each turn must match precisely the moments the speech 
starts and ends. 
 
500 ms 
If a speaker stops talking for 500 ms (0.5 seconds) or more, end the speech turn where they pause 
and create a new turn when they resume speaking. 
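
The 500 ms rule can be sketched in code. This is only an illustration, assuming word-level timestamps in milliseconds are available; the `Word` type and the function name are hypothetical and not part of any transcription tool.

```python
from dataclasses import dataclass

PAUSE_MS = 500  # from the guide: a pause of 500 ms or more ends the turn


@dataclass
class Word:
    text: str
    start_ms: int
    end_ms: int


def split_into_turns(words, pause_ms=PAUSE_MS):
    """Group time-ordered words of ONE speaker into turns.

    A new turn starts whenever the silence between the end of the previous
    word and the start of the next one is pause_ms or longer.
    """
    turns = []
    for word in words:
        if turns and word.start_ms - turns[-1][-1].end_ms < pause_ms:
            turns[-1].append(word)   # short gap: same turn continues
        else:
            turns.append([word])     # first word, or pause >= pause_ms
    return turns
```

With words ending at 800 ms and resuming at 1300 ms, the gap is exactly 500 ms, so the second word opens a new turn.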
 
30 sec 
Speech turns can never exceed 30 seconds per turn, not even by 1 millisecond. This rule doesn’t apply 
to annotations. 
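
As a rough sketch of the 30-second limit, assuming turn times are stored as integer milliseconds (the function name is hypothetical), a turn of exactly 30 000 ms passes and one of 30 001 ms does not:

```python
MAX_TURN_MS = 30_000  # from the guide: a speech turn may not exceed 30 seconds


def turn_too_long(start_ms: int, end_ms: int, max_ms: int = MAX_TURN_MS) -> bool:
    """Return True if a speech turn is longer than the allowed maximum."""
    return end_ms - start_ms > max_ms
```

Integer milliseconds are used here so that the comparison at the boundary is exact.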
 
Labels/Annotations 
- Labels/annotations mark any non-speech sound event, whether background or foreground. E.g., 
sneezing, coughing, music, knocking, cheering, crying, screaming, etc. 
- Labels have no time limit, but two annotations of the same kind cannot overlap. 
- The 500ms rule applies to annotations too! 
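
The no-overlap rule for annotations of the same kind could be checked along these lines. This is a sketch under assumptions: annotations are given as hypothetical `(label, start_ms, end_ms)` tuples, and two annotations that merely touch (one ends exactly where the next starts) are treated as not overlapping, which the guide does not spell out.

```python
from collections import defaultdict


def same_label_overlaps(annotations):
    """Flag each annotation that overlaps an earlier one with the same label.

    annotations: iterable of (label, start_ms, end_ms).
    Returns a list of (label, earlier_span, later_span).
    """
    by_label = defaultdict(list)
    for label, start, end in annotations:
        by_label[label].append((start, end))

    clashes = []
    for label, spans in by_label.items():
        spans.sort()
        latest = spans[0]  # span with the furthest end time seen so far
        for span in spans[1:]:
            if span[0] < latest[1]:      # starts before an earlier span ends
                clashes.append((label, latest, span))
            if span[1] > latest[1]:
                latest = span
    return clashes
```

Annotations with different labels may overlap freely, so they never appear in the result. The same check, grouped by speaker instead of label, would cover the rule that one speaker's turns may not intersect.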
 
Common label types: 
 
Unintelligible 
Spoken words that can’t be heard clearly 
 
Foreign speech 
Speech in any language or locale other than the target one. 
 
PII 
 
All names, addresses, phone numbers, email addresses, payment card numbers, and passwords should NOT 
be transcribed. Instead, label them as PII. 
 
Exceptions: globally known people, and the phone numbers and addresses of companies. 
 
Singing 
 
All lyrics in the target locale must be transcribed and labeled as singing. 
 
Noise 
 
Any background or foreground noise of any sort. 
 
Beginning & Ending of Turns & Labels 
Beginnings and endings of speaker turns and annotations must be accurate at all times. Be careful not to 
cut off any sound when setting the start and end times of labels and turns. 
 
 
 
 
Speech overlapping 
When more than one speaker talks at the same time, create separate turns under separate 
speakers. 
 
Overlapping speech Turns of the Same Speaker 
The same speaker can't have two turns that intersect at any point. 
 
 
Speakers Numbering 
 
Numbered speakers should be used only when you do not know the name of the speaker. For example, 
the first time you identify a speaker, they should be labeled as “speaker 1”. Do NOT capitalize the word 
“speaker”. 
 
Speaker Naming 

Only use the speaker Name option if the name of the speaker becomes known at some point in the 
audio. E.g., speaker Mary, speaker Bob, speaker David. Use first and last names if available. E.g., 
‘speaker David Jones’.
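
The numbering and naming conventions above can be summed up in a small helper. This is a hypothetical sketch, not a function of any annotation tool: it keeps the word "speaker" in lowercase, uses the number while the name is unknown, and switches to the name once it is known.

```python
from typing import Optional


def speaker_label(number: int, name: Optional[str] = None) -> str:
    """Build a speaker label: 'speaker 1' until a name is known, then 'speaker <name>'."""
    if name and name.strip():
        return f"speaker {name.strip()}"
    return f"speaker {number}"
```

For example, `speaker_label(1)` gives `speaker 1`, and `speaker_label(1, "David Jones")` gives `speaker David Jones`.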
 
 
Unidentifiable Speaker 
This is used when you cannot determine which speaker is talking. It’s common to hear speech that 
could belong to any of several speakers and be unable to tell who is talking and who is not. Such speech 
should be placed under “unidentifiable speaker”.  
 
Important: Each task should have only one unidentifiable speaker for all the speakers you cannot 
determine. 
 
Pre recorded speaker 
A pre recorded speaker can be a recording within the recording, a device talking, or background singing.  
Example 1: Live speaker with background singing. The singing lyrics should 
be assigned to a pre recorded speaker.  
Example 2: An elevator’s recorded floor announcement. 
Important: Pre recorded speakers shouldn’t be named. 

Interjections 

Refer to your official dictionary (the equivalent of Merriam-Webster for your locale) and transcribe the 
common interjections and slang confirmations you hear, such as: huh, woah, okay, yep, uh-huh, mhm, nah. 

 
 

Hesitations and truncations 

If a very small part of a word (at most one syllable) has been cut off, and you know what the word is 
supposed to be, transcribe the entire word. If you are not sure what the word should be, do not transcribe 
the word at all. Do not put punctuation after words that have been cut off. 

Proper Names, Brand Names, Products: 


 
Always use the official spelling and punctuation 
 
Correcting Speakers: 
 
Speakers’ grammar should never be corrected. Any non-standard, slang, or offensive words should be 
transcribed as long as they are intended by the speaker. 
 
Words slightly mispronounced by non-native speakers should be corrected.

Fragments versus sentences: 

Do not capitalize the beginning of a truncated sentence.   

Do not add end punctuation for incomplete sentences. 

Numbers: 
 
Cardinals and ordinals from 0 to 9 should be spelled out in words (except for measures and currency). 
Use digits for 10 and above. E.g.: I have six dogs and 12 cats. 
 
In math expressions or units & measures, transcribe fraction words using numerals and slashes. Be 
careful not to use pre-combined fraction characters like "¼".  
Correct example: In 3/4 of a mi, turn right.
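
The number and fraction rules above can be illustrated with a short sketch. Both function names are hypothetical. The first covers cardinals only (ordinals and the measures and currency exception are left out), and the second only detects pre-combined fraction characters and suggests the numeral-and-slash form, using Python's standard `unicodedata` module.

```python
import unicodedata

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def format_cardinal(n: int) -> str:
    """Spell out 0 to 9 in words; use digits for 10 and above."""
    return DIGIT_WORDS[n] if 0 <= n <= 9 else str(n)


def find_precombined_fractions(text: str):
    """Return (index, character, suggested form) for each single-character fraction."""
    hits = []
    for i, ch in enumerate(text):
        # Unicode tags single-character fractions with a "<fraction>" decomposition.
        if unicodedata.decomposition(ch).startswith("<fraction>"):
            # NFKC expands the character to digits around U+2044 (fraction slash);
            # swap that for the plain "/" the guide asks for.
            plain = unicodedata.normalize("NFKC", ch).replace("\u2044", "/")
            hits.append((i, ch, plain))
    return hits
```

A transcript line that already uses numerals and a plain slash, such as the correct example above, produces no hits.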
 
 
 
 
 
 
