System Guidelines

This document provides guidance for transcribing audio clips from the Long Transcription system. It outlines key rules for segmenting speaker turns and annotating other audio events. Speaker segments must be labeled and follow rules including allowing a 100ms buffer at segment start/end, splitting segments that exceed 30 seconds, and splitting segments with over 500ms of silence. Annotation types like noise, music and laughter are also defined. Guidelines for applying rules like giving a 1ms gap between segments split by the 30 second rule are provided. A linked video guide covers topics such as the 100ms rule in more detail.

Uploaded by

Ser Hee Paung


Introduction

1. What is Long Transcription (LT)?

It is a system containing longer conversational audio clips, either one-sided or
with multiple speakers, which must be transcribed according to the rules of the
Guideline and the Instruction manual.

You will listen to dialogue that is likely to contain multiple speakers. Your job is
to identify and mark when each speaker is speaking and to segment the corresponding
audio.

Some of the audio will contain background noise, background music, and ringtones; these
must be marked too.

2. What are the key points I need to remember to work on the project?

The key points are as follows (please refer to the FAQ and video guides below for details):
1. You will need to create speaker segments (boxes) and annotations following the
rules of the system.
2. Speakers are categorized into three types:
● Normal speakers (labelled as speaker 1, speaker 2, and so on)
● Unidentifiable speakers (labelled as unidentifiable speaker)
● Pre-recorded speakers (labelled as pre-recorded speaker 1, and so on)
The maximum number of speakers allowed in one task is 20 (this includes
unidentifiable and pre-recorded speakers). If more speakers occur in the
task, stop the task as soon as the 21st speaker is introduced.

3. Annotation types are categorized as follows:

● Noise (used for noises and for unintelligible background speech of
multiple speakers)
● Music (used for music)
● DTMF (Dual-Tone Multi-Frequency; used when you hear a phone key tone,
for example the “beep” of a key pressed during a phone call)
● Applause (used when a crowd applauds, claps, or cheers)
● Foreign Speech (used when speech occurs in a language other than the
one you are working on)
● Laughter (used when laughter is heard)
● RingTone (used when an incoming or outgoing ringtone is heard)
● Singing (used when a speaker is singing. *Note that songs should also
be labelled under “pre-recorded speaker” and annotated under “singing”.)
● Unintelligible (used when the transcriber cannot make out the word
heard)
● PII (used when the transcriber finds any personal information)

**Do not focus on the last two types (Unintelligible and PII). Use them only if
you are confident about the unintelligible and PII parts. (Please refer to the
prefill text to gain confidence.)

4. 100 ms segment rule

This rule means that you need to keep an extra 85-100 ms at the start
and end of each segment where possible, so that words are not cut off at the
segment boundaries.

5. 30 second split rule

This rule means that no speaker segment should exceed 30 seconds. If a
speaker speaks continuously for more than 30 seconds, a separate speaker
segment needs to be created for the speech after each 30-second mark.

6. 500 ms split rule

This rule means that if there is a gap of 0.5 seconds (500 ms) between words, separate
segments need to be created for the speech before and after the gap.

7. 20 speaker rule

This rule means that a task containing more than 20 speakers cannot be
continued. The task needs to be stopped when the 21st speaker is
introduced.
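
As an illustration of the 20 speaker rule, the following Python sketch finds the point at which a task should be stopped. The function name `first_excess_speaker` and the idea of representing a task as a list of speaker labels in order of appearance are assumptions made for this example; they are not part of the LT system.

```python
def first_excess_speaker(labels, max_speakers=20):
    """Return the index at which a new speaker beyond max_speakers first appears.

    labels: speaker labels in order of appearance (normal, unidentifiable and
    pre-recorded speakers all count, as stated in the key points).
    Returns None if the limit is never exceeded.
    """
    seen = set()
    for index, label in enumerate(labels):
        if label not in seen:
            if len(seen) == max_speakers:
                # This label would be speaker number max_speakers + 1.
                return index
            seen.add(label)
    return None


# Hypothetical task in which 21 different speakers appear one after another.
labels = [f"speaker {n}" for n in range(1, 22)]
stop_at = first_excess_speaker(labels)
```

In this hypothetical task, `stop_at` is 20, the position of "speaker 21", so the task would be stopped there.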

**Note that the 30 second, 100 ms, and 500 ms rules do not apply to annotations.

FAQ for LT system

1. How much gap should be given at the beginning/end of the
segment?

Each segment must begin and end with a gap of at most 100 ms between the segment
boundary and the speech. The gap can lie anywhere between 85 and 100 ms, but must
not exceed 100 ms. Apply this rule precisely to every segment you create. Do not
apply it to annotations.

Here, the speaker starts speaking at 00:00.090 and the segment begins
at 00:00.000, a gap of 90 ms, which lies between 85 and 100 ms. Apply this rule at the
end of the segment as well.

Here, the speaker stops speaking at 00:19.202 and the segment ends at 00:19.299,
a gap of 97 ms, which is approximately 100 ms.
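
The arithmetic behind the 100 ms rule can be sketched in Python as follows. This is a minimal sketch with times in milliseconds; the names `pad_segment` and `gaps_ok` are hypothetical, and combining the two timestamps above into a single segment is done only for illustration.

```python
MIN_GAP_MS = 85   # lower end of the range given in the guidelines
MAX_GAP_MS = 100  # the gap must never exceed this


def pad_segment(speech_start_ms, speech_end_ms, gap_ms=MAX_GAP_MS):
    """Return (segment_start, segment_end) with gap_ms added on both sides.

    The start is clamped to 0 because a segment cannot begin before the audio.
    """
    if not MIN_GAP_MS <= gap_ms <= MAX_GAP_MS:
        raise ValueError("gap must be between 85 and 100 ms")
    return max(0, speech_start_ms - gap_ms), speech_end_ms + gap_ms


def gaps_ok(segment, speech):
    """Check that both boundary gaps lie between 85 and 100 ms."""
    start_gap = speech[0] - segment[0]
    end_gap = segment[1] - speech[1]
    return all(MIN_GAP_MS <= gap <= MAX_GAP_MS for gap in (start_gap, end_gap))


# Speech from 00:00.090 to 00:19.202, segment from 00:00.000 to 00:19.299.
example_ok = gaps_ok(segment=(0, 19299), speech=(90, 19202))
padded = pad_segment(5000, 8000)
```

Here `example_ok` is True (the gaps are 90 ms and 97 ms) and `padded` is `(4900, 8100)`. The check is strict: when speech starts less than 85 ms into the audio, the "where possible" wording of the rule applies and the check would need to be relaxed.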

2. Can we overlap two segments?

Yes, two segments can overlap, but only if they belong to different speakers.
For example, a speaker 1 segment and a speaker 2 segment can overlap each other,
but two speaker 1 segments cannot.
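
A check for this rule might look like the Python sketch below. The tuple layout `(speaker, start_ms, end_ms)` and the function name `same_speaker_overlaps` are assumptions made for this example. Two segments of the same speaker that share a timestamp are treated as overlapping, which follows from the 1 ms gap described in question 10.

```python
def same_speaker_overlaps(segments):
    """Return pairs of overlapping segments that belong to the same speaker.

    segments: iterable of (speaker, start_ms, end_ms) tuples.
    Segments of different speakers are allowed to overlap and are ignored.
    """
    clashes = []
    latest = {}  # speaker -> that speaker's segment with the latest end so far
    for seg in sorted(segments, key=lambda s: s[1]):
        speaker, start, end = seg
        prev = latest.get(speaker)
        if prev is not None and start <= prev[2]:
            clashes.append((prev, seg))
        if prev is None or end > prev[2]:
            latest[speaker] = seg
    return clashes


# Hypothetical task: speaker 2 may overlap speaker 1, but speaker 1 may not
# overlap an earlier speaker 1 segment.
segments = [
    ("speaker 1", 0, 5000),
    ("speaker 2", 3000, 8000),
    ("speaker 1", 4000, 9000),
]
clashes = same_speaker_overlaps(segments)
```

For this hypothetical input, `clashes` contains one pair: the two speaker 1 segments.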

3. In which case should we split the segment?

A pause of 0.5 seconds between two utterances requires a split. Whenever the
speaker pauses for 0.5 seconds (500 ms) between one utterance and the next, the
segment must be split into two.

Here, the speaker stops speaking at 00:03.219 and begins again at 00:04.080,
a pause of 861 ms, which is clearly greater than 500 ms. In such
cases, two separate segments must be created.
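
The 500 ms split rule can be sketched in Python as follows. The function name `split_on_pauses` and the representation of speech as a list of `(start_ms, end_ms)` utterances are assumptions for illustration. The sketch treats a pause of exactly 500 ms as a split, which is one reading of the rule. The 100 ms boundary gap is not applied here.

```python
def split_on_pauses(utterances, min_pause_ms=500):
    """Group utterances of one speaker into segments.

    utterances: list of (start_ms, end_ms) in time order.
    A new segment starts whenever the pause since the previous utterance
    is min_pause_ms or longer; shorter pauses stay inside one segment.
    """
    segments = []
    for start, end in utterances:
        if segments and start - segments[-1][1] < min_pause_ms:
            # Short pause: extend the current segment.
            segments[-1] = (segments[-1][0], end)
        else:
            segments.append((start, end))
    return segments


# Speech stops at 00:03.219 and resumes at 00:04.080 (pause of 861 ms).
# The start time 0 and end time 6000 are made up for the example.
result = split_on_pauses([(0, 3219), (4080, 6000)])
```

Because the pause is 861 ms, `result` keeps the two utterances as two separate segments, `[(0, 3219), (4080, 6000)]`.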

4. How to use annotations?

Be precise about the timestamps of an annotation. The annotation must cover exactly
the intended sound in the audio, no more and no less.
Unlike speaker segments, an annotation should not include the 100 ms gap.

5. Can we overlap annotations?

Yes, annotations of different types can overlap. Annotations of the same type
cannot overlap: for example, music cannot overlap music.

A single music annotation can be used to cover different music sounds playing at the same time.

6. Are sound effects in audio a noise?

Yes, sound effects in the audio, such as those used in cartoons, should be
annotated as noise.

7. Should we include laughter in segments?

We should not include laughter in segments. We should annotate it as laughter.

If the speaker is laughing and speaking at the same time, the audio should be
covered by both a segment and a laughter annotation.

If the speaker laughs in the middle of their speech, we should annotate it as laughter,
and we have to check whether the laughter lasts more than 500 ms, in which case the
segment must be split using the 500 ms split rule.

8. Should we treat singing as an annotation or a segment?

We should treat singing as a segment. It falls in the category of annotations, but we
treat it as a segment and apply every segment rule to it.

9. What is 30 second split rule?

If a segment runs longer than 30 seconds, split the turn at the 30-second
mark and create a new segment for the speech after it. Create a new segment
at each 30-second mark. If a word falls on the 30-second mark, leave
the word out of the first segment and include it in the next segment, so that the word
is not split across two segments.

Here, the speaker speaks from 12:58.004 to 13:28.580, which is longer than
30 seconds. Create a segment up to the 30-second mark, i.e. 13:28.004, and create a new
segment from 13:28.005. In the same way, a new segment must be created at every
30-second mark whenever a single segment would run longer than 30 seconds.
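
The split described above can be sketched in Python as follows. The helper names `to_ms` and `split_long_segment`, and the optional `words` list of `(start_ms, end_ms)` word times, are assumptions made for this example. The sketch leaves a 1 ms gap between consecutive pieces and moves a cut to just before any word that falls on the 30-second mark.

```python
def to_ms(timestamp):
    """Convert 'MM:SS.mmm' (three-digit milliseconds) to milliseconds."""
    minutes, rest = timestamp.split(":")
    seconds, millis = rest.split(".")
    return int(minutes) * 60_000 + int(seconds) * 1000 + int(millis)


def split_long_segment(start_ms, end_ms, words=(), max_len_ms=30_000, gap_ms=1):
    """Split one continuous stretch of speech into pieces of at most max_len_ms.

    words: optional (start_ms, end_ms) word times. If a cut would fall inside
    a word, the cut moves to just before that word, so the whole word goes
    into the next piece.
    """
    pieces = []
    current = start_ms
    while end_ms - current > max_len_ms:
        cut = current + max_len_ms
        for word_start, word_end in words:
            if word_start < cut < word_end and word_start > current:
                cut = word_start - gap_ms
                break
        pieces.append((current, cut))
        current = cut + gap_ms
    pieces.append((current, end_ms))
    return pieces


# The example above: speech from 12:58.004 to 13:28.580.
pieces = split_long_segment(to_ms("12:58.004"), to_ms("13:28.580"))
```

For this example, `pieces` is `[(778004, 808004), (808005, 808580)]`, that is 12:58.004 to 13:28.004 and 13:28.005 to 13:28.580, matching the timestamps given above.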

10. How much gap should be given between two segments after
using 30 second split rule?

We should give a 1 ms gap between the two segments. For example, if one segment ends
at 00:30.579, the next segment should start at 00:30.580.
**Please note that consecutive segments of the same speaker must be separated by at
least 1 ms so that a segment does not overlap the previous one. If a segment
ends at 00:03.579 and PII begins, then the start time of the PII must be 00:03.580, and
so on.

11. Should we give 100ms gap when using the rule of 30 second
split?

If the speaker speaks continuously for exactly 2 minutes, we should create four
segments, one for each 30 seconds.
**At the beginning of the first segment, we should give the 100 ms gap as per the rule.

**Where the segments are split, we should not give the 100 ms gap. If we
gave a 100 ms gap to one of the segments there, the segments would overlap each other. So
we give only a 1 ms gap between the two segments, because the speaker is speaking
continuously and we don’t want to miss a word the speaker is saying.
**Where the speaker has stopped speaking, we can give the 100 ms gap at the end of the
segment.

Video Guides for LT system

Please go to the following link for the video guides for the below listed topics.
https://www.youtube.com/playlist?list=PLzicDqm9vmqbzq5a0-5rU9Aoy711PhyiP

Topics covered:
1. 100ms rule
2. Creating new speaker
3. 30 second rule
4. 500ms rule
5. Labelling the speakers
6. 1ms rule
