Guide On Using CLAN


Language Sample Analysis Using CLAN

This document shows how to use CLAN to analyse corpus data from Cantonese-speaking children
with typical language development, available on the Child Language Data Exchange System
(CHILDES).

1. Download CLAN and the CLAN manual (September 30, 2020)
o https://dali.talkbank.org/clan/
o https://talkbank.org/manuals/CLAN.pdf
2. Download CHAT manual
3. Download transcripts of typically developing children aged 4 (040021.cha)
4. Start using CLAN
5. Enter commands and run data analysis in CLAN: MLU run

Speech to text program:

• https://speechnotes.co/

1. CHILDES:
• https://childes.talkbank.org/
• The child language component of TalkBank
• A project organized by Brian MacWhinney at Carnegie Mellon University
• Supported by hundreds of contributors and dozens of collaborators for open data sharing
• Covers over 34 languages
• Uses a consistent transcription format for data storage called CHAT
• Provides free programs for automatic analysis and searching, e.g. CLAN

2. Download CLAN:
• http://dali.talkbank.org/clan/
• Download and install the appropriate version for your computer (V 10-Oct-2021 11:00)

• Open File Explorer and check that the CLAN program appears in c:\talkbank\CLAN

3. Download Manuals: “CHAT” & “CLAN”


• https://talkbank.org/manuals/CLAN.pdf (Sept 2021)
• https://talkbank.org/manuals/CHAT.pdf (Aug 2021)

4. Access and download Cantonese corpus data:


• Click **Index to Corpora** to access the corpora database
▪ https://childes.talkbank.org/access

• To access the Chinese database
▪ https://childes.talkbank.org/access/Chinese/

• Click Chinese to access and download the cross-sectional Cantonese corpus data (HKU-70)
▪ https://childes.talkbank.org/access/Chinese/Cantonese/HKU.html
• Download the longitudinal Cantonese corpus data (Lee/Wong/Leung)
▪ https://childes.talkbank.org/access/Chinese/Cantonese/LeeWongLeung.html
• Locate the downloaded database files in c:/download/HKU or c:/download/LeeWongLeung

• Save and unzip the data files in c:/talkbank/CLAN/Work
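
The unzip step can also be scripted. The following is a minimal Python sketch, not part of CLAN, that extracts a downloaded archive into the working directory and lists the transcripts it finds. The function name `unzip_corpus` and the archive name `HKU.zip` in the usage comment are assumptions for illustration; this guide does not state the name of the downloaded file.

```python
import zipfile
from pathlib import Path


def unzip_corpus(zip_path, work_dir):
    """Extract a corpus archive into work_dir and return its .cha files."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(work_dir)
    # CHAT transcripts use the .cha extension; search sub-folders as well.
    return sorted(work_dir.rglob("*.cha"))


# Example call (hypothetical archive name, adjust to your own download):
# unzip_corpus("c:/download/HKU.zip", "c:/talkbank/CLAN/work")
```

Listing the extracted `.cha` files is a quick way to confirm that a transcript such as 040021.cha ended up where CLAN will look for it.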

5. Start CLAN (Windows version):


• Start the CLAN program

• Two windows open: a large “Output” window (Clan - [newfile.cha]) and a small “Commands” window

• “Commands” window:
▪ has four buttons in the top left-hand corner, each with a corresponding directory
i. working (C:\talkbank\clan\work)
• to locate the "working" directory
• Click “working”
• Click “select directory”
ii. output
• to locate the output directory
• by default same as “working” directory
iii. lib (C:\talkbank\clan\lib)
iv. mor (C:\talkbank\clan\lib)

• “Output” window:
▪ This is where the output of CLAN analyses is displayed

6. Language Sample Analysis: Commands in CLAN:

MLUm

a) Select the MLU command to analyse the Mean Length of Utterance in morphemes (MLUm)
• Click “Progs” and scroll down the list of commands
• Select “MLU”; the “mlu” command will be shown on the command line

b) Select the transcript file for analysis


• Click the “File In” button next to the “Progs” pull-down menu button
• Select “HKU” under the “work” folder (c:\talkbank\CLAN\work\HKU)
• Double-click “4” (c:\talkbank\CLAN\work\HKU\4)
• Select the first file, “040021.cha” (c:\talkbank\CLAN\work\HKU\4\040021.cha)
• Click the “add ->” button to choose the transcript 040021.cha
• The file 040021.cha appears in the small window on the right
• Click the “done” button
• The “@” symbol, which stands for the selected file(s), will be shown right after “mlu” on the command line

c) Select the data tiers for data analysis


• Click “Tiers” next to the “File In” button
• Click “More choices” in the bottom left corner of the pop-up window
• Click “*CHI” to select only the utterances of the target child for data analysis
• Click “%mor” to select only the %mor tier for data analysis
• Click “OK”; the “+t*CHI:” option will be shown on the command line

d) Ready to run data analysis


• Click “Run” in the bottom right corner of the smaller “Commands” window
• The output of the analysis will be shown in the larger “Output” window
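
To make clear what the MLU figure summarises, here is a rough Python sketch of the underlying arithmetic: the number of units on the child's %mor tiers divided by the number of child utterances. It is not CLAN's implementation. It assumes that each %mor item containing "|" counts as one unit and ignores any affix or clitic markers, so its result may differ from CLAN's output. The function name and the sample transcript (including its part-of-speech tags) are invented for illustration.

```python
def mean_length_of_utterance(chat_text, speaker="CHI"):
    """Return (utterances, units, mlu) for one speaker's %mor tiers.

    Simplification: each %mor item containing "|" counts as one unit.
    """
    # Re-join continuation lines, which CHAT marks with a leading tab.
    lines = []
    for raw in chat_text.splitlines():
        if raw.startswith("\t") and lines:
            lines[-1] += " " + raw.strip()
        else:
            lines.append(raw)

    utterances = 0
    units = 0
    speaker_is_target = False
    for line in lines:
        if line.startswith("*"):
            # A main tier line sets the speaker for the dependent tiers below it.
            speaker_is_target = line.startswith("*" + speaker + ":")
        elif line.startswith("%mor:") and speaker_is_target:
            items = [tok for tok in line[len("%mor:"):].split() if "|" in tok]
            if items:
                utterances += 1
                units += len(items)
    mlu = units / utterances if utterances else 0.0
    return utterances, units, mlu


# Invented sample in CHAT layout; the part-of-speech tags are placeholders.
SAMPLE = (
    "@Begin\n"
    "*CHI:\t我 食 咗 飯 .\n"
    "%mor:\tpro|我 v|食 asp|咗 n|飯 .\n"
    "*INV:\t你 食 咗 乜嘢 ?\n"
    "%mor:\tpro|你 v|食 asp|咗 pro|乜嘢 ?\n"
    "*CHI:\t食 咗 麵 .\n"
    "%mor:\tv|食 asp|咗 n|麵 .\n"
    "@End\n"
)

print(mean_length_of_utterance(SAMPLE))  # (2, 7, 3.5)
```

Restricting the count to `*CHI` utterances and their `%mor` lines mirrors the tier selection made in step c).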

VOCD (VOCabulary Diversity)


a) Select the command VOCD to analyse vocabulary diversity
• Click “Progs” and pull down the list of commands
• Select “vocd”; the “vocd” command will be shown on the command line

b) Select the transcript file for analysis


• Click the “File In” button next to the “Progs” pull-down menu button
• Select c:\talkbank\CLAN\work\HKU
• Double-click “4”
• Select the first file, “040021.cha”
• Click the “add ->” button to choose the transcript 040021.cha; it will be displayed in the
small window on the right
• Click the “done” button; the “@” symbol will be shown right after the “vocd” command
on the command line

c) Select the data tiers for data analysis

• Click “Tiers” next to the “File In” button
• Click “More choices” in the bottom left corner of the pop-up window
• Click “*CHI” to select only the utterances of the target child for data analysis
• Click “%mor” to select only the %mor tier for data analysis
• Click “OK”; the “+t*CHI:” option will be shown on the command line

d) Ready to run data analysis

• Click “Run” in the bottom right corner of the smaller “Commands” window
• The output of the analysis will be shown in the larger “Output” window
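
As background to the VOCD output, the sketch below shows the simpler type-token ratio (TTR) and a model curve relating TTR to sample size through a diversity parameter D. This is not the vocd program. The curve TTR = (D/N)(sqrt(1 + 2N/D) - 1) follows the published description of the D measure and is an assumption here, since this guide does not state it. The function names are invented for illustration.

```python
import math


def type_token_ratio(tokens):
    """Number of distinct words divided by the total number of words."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def ttr_model(d, n):
    """TTR predicted for a sample of n tokens by the D model curve.

    Assumed form: TTR = (D / N) * (sqrt(1 + 2 * N / D) - 1).
    """
    return (d / n) * (math.sqrt(1 + 2 * n / d) - 1)


# Invented token list: 6 tokens, 4 distinct words.
tokens = ["食", "咗", "飯", "食", "咗", "麵"]
print(type_token_ratio(tokens))

# For a fixed D, the predicted TTR falls as the sample gets longer.
print(ttr_model(50, 35), ttr_model(50, 50))
```

Raw TTR drops as a transcript gets longer, which is the motivation for reporting a fitted parameter such as D instead of TTR itself.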

FREQ

a) Select the command FREQ to analyse the frequency of a target word

• Click “Progs” and pull down the list of commands
• Select “freq”; the “freq” command will be shown on the command line

b) Select the transcript file for data analysis


• Click the “File In” button next to the “Progs” pull-down menu button
• Select c:\talkbank\CLAN\work\HKU
• Double-click “4”
• Select the first file, “040021.cha”
• Click the “add ->” button to choose the transcript 040021.cha; it will be displayed in the
small window on the right
• Click the “done” button; the “@” symbol will be shown right after the “freq” command on
the command line

c) Select the data tiers for data analysis


• Click “Tiers” next to the “File In” button
• Click “More choices” in the bottom left corner of the pop-up window
• Click “*CHI” to select only the utterances of the target child for data analysis
• Click “OK”; the “+t*CHI:” option will be shown on the command line
• Type +s"咗" on the command line (the +s option searches for the given word, here 咗)

d) Ready to run data analysis

• Click “Run” in the bottom right corner of the smaller “Commands” window
• The output of the analysis will be shown in the larger “Output” window
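
The count that this FREQ search produces can be approximated outside CLAN for checking purposes. The Python sketch below tallies whitespace-separated words on the child's main tier lines and looks up 咗. It is a simplification, not FREQ itself: it ignores continuation lines and CHAT codes, and the function name and sample transcript are invented for illustration.

```python
from collections import Counter


def speaker_word_counts(chat_text, speaker="CHI"):
    """Count whitespace-separated words on one speaker's main tier lines."""
    prefix = "*" + speaker + ":"
    counts = Counter()
    for line in chat_text.splitlines():
        if line.startswith(prefix):
            words = line[len(prefix):].split()
            # Drop utterance terminators so that only words are counted.
            counts.update(w for w in words if w not in {".", "?", "!"})
    return counts


# Invented sample in CHAT layout.
SAMPLE = (
    "*CHI:\t我 食 咗 飯 .\n"
    "*INV:\t你 食 咗 乜嘢 ?\n"
    "*CHI:\t食 咗 麵 .\n"
)

counts = speaker_word_counts(SAMPLE)
print(counts["咗"])  # 2
```

The investigator's utterance is skipped because only lines beginning with `*CHI:` are read, which corresponds to the `+t*CHI` tier selection.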

COMBO

a) Select the command COMBO to search for a specific combination of words

• Click “Progs” and pull down the list of commands
• Select “combo”; the “combo” command will be shown on the command line

b) Select the transcript file for data analysis
• Click the “File In” button next to the “Progs” pull-down menu button
• Select c:\talkbank\CLAN\work\HKU
• Double-click “4”
• Select the first file, “040021.cha”
• Click the “add ->” button to choose the transcript 040021.cha; it will be displayed in the
small window on the right
• Click the “done” button; the “@” symbol will be shown right after the “combo” command
on the command line

c) Select the data tiers for data analysis

• Click “Tiers” next to the “File In” button
• Click “More choices” in the bottom left corner of the pop-up window
• Click “*CHI” to select only the utterances of the target child for data analysis
• Click “OK”; the “+t*CHI:” option will be shown on the command line
• Type +s"*^咗" on the command line (the * wildcard matches any string; the ^ operator means “followed by”)

d) Ready to run data analysis

• Click “Run” in the bottom right corner of the smaller “Commands” window
• The output of the analysis will be shown in the larger “Output” window
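
The pattern "*^咗" can be pictured with a small Python sketch that returns the child's utterances in which 咗 directly follows another word. This is not the COMBO program. It reads ^ as "immediately followed by", which is an assumption about COMBO's behaviour, and it ignores continuation lines. The function name and sample transcript are invented for illustration.

```python
def utterances_with_word_before(chat_text, target, speaker="CHI"):
    """Return the speaker's main tier lines where target follows another word."""
    prefix = "*" + speaker + ":"
    matches = []
    for line in chat_text.splitlines():
        if not line.startswith(prefix):
            continue
        words = line[len(prefix):].split()
        # target at position 1 or later means some word comes right before it.
        if target in words[1:]:
            matches.append(line)
    return matches


# Invented sample in CHAT layout.
SAMPLE = (
    "*CHI:\t食 咗 飯 .\n"
    "*CHI:\t咗 .\n"
    "*INV:\t食 咗 未 ?\n"
)

for line in utterances_with_word_before(SAMPLE, "咗"):
    print(line)
```

Only the first utterance is printed: the second has no word before 咗, and the third is spoken by the investigator.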
