Transcription Format
0 (Updated Jul/16/2022)
All transcripts MUST label the Speaker. Speaker labels must always have a space after the colon.
Example: John: This is a nice day.
Use only square brackets, [], around tags. Never use parentheses (), curly brackets {}, angle brackets <>,
or any other bracket-type symbols.
Important Criteria:
• Non-verbatim transcripts must be cleaned up, not verbatim. No uhm, arrh, etc. (See the section
below on Verbatim / Non-Verbatim.)
• Uses standard English, not netspeak or phonetics (amirite? cuz u know its not profeshnul etc…);
• DO NOT paraphrase (do not type the gist of what you heard, but the words actually used).
• NEVER type anything that is not spoken in the audio, including your comments or the job
number or title.
• Do not make up words. We mean this in two ways:
First, do not spell words phonetically. All words should be spell-checked and must be actual English
words, unless the speaker was deliberately making up words, such as "what awesome
majorness!"
Second, do not include words just because they sound similar to the syllables that were spoken. This is
one of the co-owners’ pet peeves. These tips should prevent that problem:
• Transcribe contractions as spoken by the speaker. If the speaker says “He’ll,” write “He’ll”;
if the speaker says “He will,” write “He will.”
• Read your transcript before submitting, as if you were reading an article or story. If the words
you used do not make sense in each sentence, they are probably not the words the speaker was
saying.
• Tag any words that you are uncertain about or can’t make out with [?], e.g., uncertain[?]
• Use all the information you have available. There are a few major ways in which you can get
extra information, and we ask that you use them:
• ALWAYS read the Extra Comment area to see if it contains clues such as the speakers’ names or the
correct spelling of certain terms mentioned in the audio!
• The audio itself can give you new information. For example, if at the end, the interviewer says,
"Thanks, Dave, for this interview!", and the interviewee’s response clearly indicates that he is
Dave, then you must go back and relabel the interviewee as Dave: throughout the transcript.
• For both verbatim and non-verbatim, when one person is speaking and another says nothing but
"uh-huh" or “mm-hmm” in the meantime, leave out all those murmuring noises as long as
they aren’t an answer to a question from the speaker.
Example:
WRONG:
John: I went downtown today.
Joe: Uh-huh.
John: I like it there.
Joe: Mm-hmm.
John: And I think I’m getting better at finding parking spots!
RIGHT:
John: I went downtown today. I like it there. I think I’m getting better at finding parking spots!
None of what the other person said was important, so it is left out. If the "Mm-hmm" is in response
to a direct question from the speaker, and the speaker waited for that answer, then include it so
the transcript still makes sense.
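The murmur rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `drop_backchannels`, the `(speaker, text)` input shape, and the short `BACKCHANNELS` list are hypothetical, and "answers a question" is approximated by checking whether the previous kept turn ends with a question mark.

```python
# Murmurs named in the guideline; extend the set as needed (assumption).
BACKCHANNELS = {"uh-huh", "mm-hmm"}

def drop_backchannels(turns):
    """Remove murmur-only turns unless they answer a question.

    `turns` is a list of (speaker, text) pairs; the result has the same shape.
    """
    cleaned = []
    for speaker, text in turns:
        is_murmur = text.strip().rstrip(".!").lower() in BACKCHANNELS
        # Approximation: the previous kept turn ended with a question mark.
        answers_question = bool(cleaned) and cleaned[-1][1].rstrip().endswith("?")
        if is_murmur and not answers_question:
            continue
        if cleaned and cleaned[-1][0] == speaker:
            # Same speaker keeps talking: join onto the previous turn.
            cleaned[-1] = (speaker, cleaned[-1][1] + " " + text)
        else:
            cleaned.append((speaker, text))
    return cleaned
```

Applied to the WRONG example above, the two murmurs from Joe are dropped and John's three sentences are merged into a single turn. A human transcriber still has to judge whether the speaker actually waited for the answer.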
Speaker Labels:
• Each change in speakers should be placed on a new line. Add a blank line before the changed
speaker.
• Standard format: A complete speaker label includes a colon after the label as well as a space
after that colon. Do this: “Woman 2: “
• When to use: whenever the speakers change, or whenever something happens on a separate
line (like [laughter]) in the middle of a person speaking (even if the same person keeps speaking
after).
• Order of preference for labels: Use names whenever possible, then roles, then use gender as the
last resort.
• Full names: When you have information about a speaker’s full name (from the Extra Comment area or
because they state their name or are announced by name), use it the first time they appear in
your audio chunk. After that, use only their first name if known, or their last name if the first name is not
known. Do not repeat the full name.
• Descriptiveness: Make each speaker’s label as informative as possible about the person’s role in
the audio. Except in the case of large groups (see special subsection, later in this section), labels
must be useful for telling one person from another. Woman 1: is acceptable, but Interviewer: or
Host: is much better. Other roles that may apply (use your judgment): Congregant:, Audience
Member: , Passerby: , Announcer: , Interviewee: .
• Adding gender: Use Male and Female only as adjectives for roles, never by themselves. Only
mention gender at all if people of different genders have the same role in the audio. Like this:
Male Host: , Female Host: , but two female hosts would just be Host 1: and Host 2: .
• Adding numbers and cutting down on clutter: Always use numbers with "Man" or “Woman”
labels. Do not use numbers if the speaker has a role other than just “Man” or “Woman,”
unless the audio includes two or more people of the same gender who are playing that same
role. Like this: Woman 1: and Man 2: , or (if there are two male hosts and one female one)
Female Host: , Male Host 1: , and Male Host 2:.
• Audience: is the label for an audience as a whole, unless they are gathered in a church or other
place of worship, which makes them a Congregation:.
• A single member of that group is labeled Audience Member: or Congregant: .
• If there are already two or more other speakers in your audio, don’t worry about telling the
audience or congregation apart. Each one will just be Audience Member: or Congregant: , with
no numbers or gender needed.
• However, if there is only one main speaker on the audio, then be more detailed in specifying the
first two group members who speak. We prefer you do this by mentioning gender (if they have
different genders from each other): Female Audience Member:. If they are both the same
gender, then add a number to their labels instead: Audience Member 2: .
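The gender and numbering rules for role labels can be sketched as a small Python helper. Everything here is an illustrative assumption rather than part of the guideline: the name `assign_labels`, the `(role, gender)` input with `"male"` or `"female"` strings, and the choice to number "Man" and "Woman" labels with separate counters. Names, when known, always take priority over these labels, and the audience rules are not covered.

```python
from collections import Counter, defaultdict

def assign_labels(speakers):
    """Return one label per (role, gender) pair; role may be None.

    Numbers follow the order in which speakers first appear.
    """
    role_counts = Counter(role for role, _ in speakers)
    pair_counts = Counter(speakers)
    genders_per_role = defaultdict(set)
    for role, gender in speakers:
        genders_per_role[role].add(gender)

    seen = Counter()
    labels = []
    for role, gender in speakers:
        if role is None:
            # No name or role known: gender is the last resort, always numbered.
            base = "Man" if gender == "male" else "Woman"
            numbered = True
        elif len(genders_per_role[role]) > 1:
            # Different genders share the role: gender becomes an adjective.
            base = f"{gender.capitalize()} {role}"
            numbered = pair_counts[(role, gender)] > 1
        else:
            # Role alone is enough; number it only if it is shared.
            base = role
            numbered = role_counts[role] > 1
        seen[base] += 1
        labels.append(f"{base} {seen[base]}:" if numbered else f"{base}:")
    return labels
```

For two male hosts and one female host this yields Male Host 1:, Female Host:, and Male Host 2:, matching the example in the list above, while two female hosts become Host 1: and Host 2:.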
- Hello, how are you? - I am all well, thank you. - That’s Great!
Tag List:
Tags list for things you couldn’t make out, with brief definitions
• When there are no real speakers: [background sounds only], [background conversations] or
[silence] are completely OK to use. If the audio file is completely silent, email us at Emergency
Escalation @ [email protected] or click on the live chat button -- It may be an
incomplete or corrupted audio file. NEVER SUBMIT AN EMPTY FILE.
• Signs that an audio file may be corrupted include all static, high-pitched squealing, and high-speed,
high-pitched voices.
• Simply put, if it's out of the ordinary, email Emergency Escalation or contact live chat support.
• End all transcripts with [End].
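Several of the rules so far are mechanical enough to check automatically: a space after the colon in a speaker label, square brackets only, and [End] as the final line. The following Python sketch is a rough heuristic, not an official checker. The function name `check_transcript` and both regular expressions are assumptions, and the label pattern can misread a line that begins with a time such as "It was 8:11", so its findings should be reviewed by hand.

```python
import re

# Assumed label shape: capitalised text at the start of a line, then a colon.
LABEL_RE = re.compile(r"^[A-Z][A-Za-z0-9 .'-]*:(.*)$")
# Any (...), {...} or <...> group counts as a wrongly bracketed tag.
BAD_TAG_RE = re.compile(r"\([^()]*\)|\{[^{}]*\}|<[^<>]*>")

def check_transcript(text):
    """Return a list of (line_number, problem) pairs for a transcript."""
    problems = []
    lines = text.splitlines()
    for number, line in enumerate(lines, start=1):
        label = LABEL_RE.match(line)
        if label and label.group(1) and not label.group(1).startswith(" "):
            problems.append((number, "no space after the colon in the speaker label"))
        if BAD_TAG_RE.search(line):
            problems.append((number, "brackets other than [] used"))
    # The examples in this guide write the tag as [End] and [END]; accept both.
    if not lines or lines[-1].strip().lower() != "[end]":
        problems.append((len(lines), "transcript does not end with [End]"))
    return problems
```

An empty result does not mean the transcript is correct. It only means none of these three surface problems were detected.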
Verbatim Transcripts
Do not summarize; write down exactly what you hear. A verbatim transcript is prepared by transcribing
each and every utterance in an audio file exactly as delivered, including non-verbal ones such as
[pauses], [laughter], [silence] and [throat clearing].
In a verbatim transcript, false starts, repetitions, and grammatical errors are copied faithfully,
without being tidied up or made more concise. The reader receives a true copy of an
event with the words transcribed exactly the way they were spoken, which gives the transcript a
realistic, movie-dialogue feel. Such a transcript is quite helpful when an interview is being
documented or serves as a testimonial for legal purposes, as the thought process is implied through
verbal cues, such as repeated words or phrases, or awkward hesitations.
Transcribe every utterance, including repetitive phrasing, false starts, filler words like "um," "uh,"
"er," etc., and every "I mean," "you know," etc. (There are grammatically correct uses of those and
similar words/phrases, even in non-verbatim.)
All slang should be retained: "gonna," "kinda," "sorta," "cuz" (or "coz"), etc. should be written exactly as
spoken. In other words, do not make any kind of grammatical corrections to the language.
When multiple speakers are involved, a verbatim transcript indicates segments in which there is an
overlap of voices.
Non-Verbatim Transcripts
Businesses that want meetings transcribed, or academics who want to provide their
lectures to students in written form, would not want something like a verbal nod
included in the transcript. They would rather have a clean transcript that is more
reader-friendly.
A non-verbatim or “intelligent” transcription, rather than typing the words exactly the way they are
spoken, captures the fundamental meaning behind them. Errors in grammar are rectified and words or
sounds that don't contribute to the underlying message are removed. If fillers or repetitions occur
naturally in the speakers' speech patterns, they are simply removed by the transcriptionist. In other
instances, paraphrasing of a statement is required which conveys the same idea, but more succinctly. A
non-verbatim transcript can be published online without edits, or it can serve as a marketing piece.
All slang should be changed to the proper spelling, e.g., "gonna" changed to "going to", "kinda" changed to
"kind of", and "cuz" (or "coz") changed to "because".
• Where possible, break compound sentences into smaller ones. Long sentences should be broken
into fragments.
• Insert a blank line between paragraphs. Also, start a new paragraph at every speaker change.
• Follow correct grammar. All sentences should start with a capital letter and have the correct
punctuation.
• If a single speaker speaks throughout without a speaker change, break the text into paragraphs
as the topics change. A new paragraph by the same speaker does not need a speaker label.
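The slang substitutions listed in this section for non-verbatim transcripts lend themselves to a simple find-and-replace pass. The Python sketch below is one way to do it. The names `SLANG` and `expand_slang` are hypothetical, and the table contains only the substitutions this guide spells out; other slang would have to be added by hand.

```python
import re

# Only the substitutions named in the guideline; anything else is left alone.
SLANG = {"gonna": "going to", "kinda": "kind of", "cuz": "because", "coz": "because"}
SLANG_RE = re.compile(r"\b(" + "|".join(SLANG) + r")\b", re.IGNORECASE)

def expand_slang(text):
    """Replace the listed slang words with their standard spelling."""
    def replace(match):
        word = match.group(1)
        full = SLANG[word.lower()]
        # Keep a capital letter if the slang word started a sentence.
        return full.capitalize() if word[0].isupper() else full
    return SLANG_RE.sub(replace, text)
```

A pass like this only handles spelling. Removing fillers, fixing grammar, and splitting long sentences still need a human read-through.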
Transcription Examples:
Jerry: Correct, because you can see, I am labeled with just my first name now. If you don’t know my first
name, then you may name me by my last name. Mr Atkins[?] or Mrs Atkins[?] if female.
Man 1: Hi, I'm a new speaker. No one ever mentions my name, so the transcriber going to [inaudible]
give me a descriptive name. Here, the only information that can be gathered on me is that I'm male. So
in this case I'm Man 1.
Man 2: Now, there are two identified males. No one ever mentions my name either, I am identified later
than Man 1, so I’d be Man 2.
[END]
Jerry: Correct, umm cuz you can see, I am labeled with just my first name now. If you don’t know my first
name, then you may name me by my last name. Mr Atkins[?] or Mrs Atkins[?] if female.
Man 1: Hi, I'm a new speaker. No one ever mentions my name, so the transcriber gonna [inaudible] give
me a descriptive name. Here, the only information that can be gathered on me is that I'm male. So in this
case I'm Man 1.
Man 2: Now, there are two identified males. No one ever mentions my name either, I am identified later
than Man 1, so I’d be Man 2.
[END]
Example:
VERBATIM:
Man 1: Yeah! She ju-, uh, just suddenly lost consciousness.
Man 2: O-okay. W-, what have you done then? Did you, uh, bring her to the clinic?
NON-VERBATIM:
Man 1: Yeah! She just suddenly lost consciousness.
Man 2: Okay. What have you done then? Did you bring her to the clinic?
Stutters
Stuttering, also called stammering, is a speech disorder in which the flow of speech is disrupted by
involuntary prolongations. Stammers are removed in non-verbatim transcripts but retained in verbatim
transcripts. Examples:
VERBATIM: W-w-w-w-w-well, I-I, uh, I th-th-thought that sh-she uh, s-she, uh, she left a-a-already.
VERBATIM
David: Did you see Francis point the gun to Mrs. Gomez?
Ivan: [sighs]
VERBATIM:
NON-VERBATIM:
Example 2:
If a patient is talking and the doctor interrupts with a question that is answered, do the following:
Non-Verbatim Example:
Patient: The pain is in this area [crosstalk] below the knee.
The doctor interjected before the patient finished speaking, but we do not want to split the sentence
into two paragraphs. Therefore, the patient's sentence is completed, and then the doctor's question is
inserted into a new paragraph.
Numbers:
• Numbers: Phone numbers, street addresses, zip codes, dates, years, units of measurement, and numbers
between 0 and 9 should be written as numerals. ALL other numbers, such as fractions, decimals, and
months, should ONLY be written in words. (Time and money are special cases.)
• Time: If an exact time is mentioned, write it as “8:11 a.m.” If the speaker says “o’clock,” write
as spoken: “eight o’clock.” If the speaker doesn’t mention an exact time, write it in words: “Let’s
have dinner at nine.” Days: A.D. 2010, the 1980s, the ’90s, 21st century.
• Spell out units of measurement, such as “inches,” “feet,” “yards,” “miles,” “ounces,” “pounds,”
and “tablespoons.” However, if spoken in shortened form, symbols should be used. Example:
o WRONG: I'm five-eight
o TRANSCRIPTION: I'm 5'8''.
• Use numerals and the percent sign to indicate all percentages except at the beginning of a
sentence. Examples:
o Middle of the sentence: Only 6% of the votes were counted.
o Beginning of the sentence: Fifty-one percent of the group voted.
• Use the numeral plus the lowercase “th,” “st,” or “nd” when a day of the month is mentioned by
itself (no month is referred to). Example:
WRONG: John went diving on the eighth.
RIGHT: John went diving on the 8th.
• When the day precedes the month, use the numeral plus the lowercase “th,” “st,” or “nd” if the
ending is spoken. Example: My birthday is on the 9th of May.
• Use the numeral alone when the day follows the month. Example: I will get back to you on
September 16.
• When the month, day, and year are spoken, use the numeral alone for the day, even if an ending
(“th,” “st,” or “nd”) is spoken. Example:
NARRATION: Paul will marry Sue on July sixth, nineteen eighty.
TRANSCRIPTION: Paul will marry Sue on July 6, 1980.
• Use the numeral plus “cents” or “¢” for amounts under one dollar. Examples: I need 15 cents. I
owe you 32¢.
• Use the dollar sign plus the numeral for dollar amounts under one million. For whole-dollar
amounts of one million and greater, spell out “million,” “billion,” etc. Examples:
• Use the word “dollars” when describing a range of up to ten dollars.
Example: I hope I find three or four dollars.
• Use the dollar sign and numerals when transcribing a range of currency over ten dollars.
Example: Alice expected a raise of $6000 to $7000.
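The day-of-month rules earlier in this Numbers section can be summarised in a short Python sketch. The function names `ordinal` and `format_date` and the `day_first` flag are hypothetical. The sketch assumes the ending was spoken whenever the day comes before the month, and it also produces "rd" (3rd, 23rd), which the list above does not mention but which follows the same pattern.

```python
def ordinal(day):
    """Return the day with its lowercase ending, e.g. 8 -> '8th'."""
    if 11 <= day % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"

def format_date(day, month=None, year=None, day_first=False):
    """Format a spoken date following the day-of-month rules."""
    if month is None:
        # Day mentioned by itself: numeral plus ending.
        return f"the {ordinal(day)}"
    if day_first:
        # Day precedes the month and the ending is spoken.
        return f"the {ordinal(day)} of {month}"
    if year is None:
        # Day follows the month: numeral alone.
        return f"{month} {day}"
    # Month, day, and year: numeral alone, even if an ending was spoken.
    return f"{month} {day}, {year}"
```

For instance, `format_date(6, "July", 1980)` gives "July 6, 1980", the same form as the TRANSCRIPTION line in the Paul and Sue example.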
Colloquy is a common format for court transcriptions. Please read this document very carefully.
1. All speaker labels must be capitalized and use ONLY the LAST name, e.g., MS. SMITH: MR. JOHNSON:
2. For a judge, arbitrator, etc., always use THE COURT:
3. Single space: colloquy should use single spacing by default (unless the client requests double
spacing).
4. During direct examination, always use Q. and A. and put "DIRECT EXAMINATION BY <NAME>" above
the question to designate the person asking the questions.
Example:
DIRECT EXAMINATION BY MR. SMITH:
Q. What is your name?
A. My name is Susan Joans.
Q. How old do you think the victim is?
A. He is 21.
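The layout in the example above is regular enough to generate from structured data. The Python sketch below shows one possible approach. The function name `format_examination` and the `(question, answer)` input shape are assumptions made for illustration, not part of the guideline.

```python
def format_examination(examiner, exchanges, heading="DIRECT EXAMINATION"):
    """Return Q. and A. text for a list of (question, answer) pairs."""
    # Speaker labels are capitalized, so the examiner's name is upper-cased.
    lines = [f"{heading} BY {examiner.upper()}:"]
    for question, answer in exchanges:
        lines.append(f"Q. {question}")
        lines.append(f"A. {answer}")
    # Single spacing by default: one line per question or answer.
    return "\n".join(lines)
```

Calling `format_examination("Mr. Smith", [("What is your name?", "My name is Susan Joans.")])` returns the heading line followed by one Q. line and one A. line.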
When the Q&A is interrupted, format the interruption as colloquy, as described above. Once the
Q&A resumes, put "BY <NAME>" above the next question to designate the person asking the questions.
Example:
Speaker Labels:
• Order of preference for labels: Use names whenever possible, then roles, then use gender as the
last resort.
• Descriptiveness: make each speaker’s label as informative as possible about the person’s role in
the audio. Except in the case of large groups (see special subsection, later in this section), labels
must be useful for telling one person from another. Woman 1: is acceptable, but Interviewer: or
Host: is much better. Other roles that may apply (use your judgment): Congregant:, Audience
Member: , Passerby: , Announcer: , Interviewee: .
• Whenever possible, relabel speakers shown only by their gender, to make their label into a
descriptive one. Do this: "Woman 1: “ becomes Interviewer:.
• The audio itself can give you new information. For example, if at the end, the interviewer says,
"Thanks, Dave, for this interview!", and the interviewee’s response clearly indicates that he is
Dave, then you must go back and relabel the interviewee as Dave: throughout the transcript.
• Full names: when you have information about a speaker’s full name (from the coversheet or
because they state their name or are announced by name), use it the first time they appear in
your audio chunk. After that, use only their first name if known, or their last name if the first name is not
known. Do not repeat the full name.
• Adding gender: use Male and Female only as adjectives for roles, never by themselves. Only
mention gender at all if people of different genders have the same role in the audio. Like this:
Male Host: , Female Host: , but two female hosts would just be Host 1: and Host 2: .
• Adding numbers and cutting down on clutter: Always use numbers with "Man" or “Woman”
labels. Do not use numbers if the speaker has a role other than just “Man” or “Woman,” unless
the audio includes two or more people of the same gender who are playing that same role. Like
this: Woman 1: and Man 2: , or (if there are two male hosts and one female one) Female Host: ,
Male Host 1: , and Male Host 2:.
• If you are editing an audio file with multiple transcribed files (split-up files), you MUST make sure
the speaker labels are consistent. Two or more transcribers have worked on the same audio file and
will most likely have labeled the same person differently, so relabeling may be needed.
• The Extra Comment area contains clues such as the speakers’ names or the correct spelling of
certain terms mentioned in the audio, so please check it!
Standard Format
• A complete speaker label includes a colon after the label as well as a space after that colon. Do
this: “Woman 2: “
• Each change in speakers should be placed on a new line. Add a blank line before the changed
speaker.
• Whenever the speakers change, or whenever something happens on a separate line (like
[laughter]) in the middle of a person speaking (even if the same person keeps speaking after),
always start the new line with a speaker label, “Speaker: “.
• Fix any line breaks in the middle of your paragraphs before submitting the work.
• Where possible, break compound sentences into smaller ones. Long sentences should be broken
into fragments.
• Insert a blank line between paragraphs. Also, start a new paragraph at every speaker change.
• Follow correct grammar. All sentences should start with a capital letter and have the
correct punctuation.