The Semantic Spaces of Child-Directed Speech, Child Speech and Adult-Directed Speech: A Manifold Perspective
Results
The general trend is that the highest unlabeled precisions are
found in the upper right corners of the contour maps whereas
the lowest unlabeled precisions tend to lie close to the x-axis.
The dimensionality of the embedding space can be
interpreted as the granularity of children’s representations.
The results of the alignments are shown in Figures 1 and 2. In the alignments from CS to CDS and from CS to COCA, the CS-COCA alignment achieves only 50% to 60% of the unlabeled precision of the CS-CDS alignment. The unlabeled precision of the CS-CDS alignment is consistently higher than that of the CS-COCA alignment across all conditions. Both alignments have much higher unlabeled accuracy than the random baseline.
The CS data are aligned to both the spoken COCA and CDS corpora. The CS-CDS alignment precision exceeds the CS-COCA precision across all conditions. In other words, child speech is much easier to map to child-directed speech than to spoken COCA. This easier alignment can be interpreted as similarity in semantic spaces across corpora. Since the CS and the CDS word vectors are trained on speech data from different experiments, the relative similarity between CS and CDS lexical semantics does not reflect mere priming effects. There are two possible interpretations for this result. First, the result can be viewed as an imitation effect in which children mirror child-directed speech semantically. Second, adult caregivers might adapt their mental representations to children’s when they talk to
Figure 2 Unlabeled accuracies of CS-CDS and CS-COCA alignments with a random alignment as the baseline
The degree of a vertex measures the association between a vertex and its neighboring vertices. The prediction is that vertices with large degree are better labeled data than vertices with small degree. Cognitively, the verbs with high degree are semantically general verbs whereas the verbs with low degree are the ones with less general meanings.
Method
Verbs are ranked based on their vertex degree in a semantic network. As shown in Table 1, the labeled data consist of the 100 verbs with the largest degrees, the 100 verbs with the smallest degrees, and the medium-degree verbs with degree ranks 201 to 300. We also mixed half of the high-degree verbs with half of the medium-degree verbs in the mixed condition. The baseline condition is averaged over 5 random initializations. We set the number of mutual nearest neighbors, the evaluation radius and the dimensionality all to 20.
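The selection of labeled verbs described above can be illustrated with a minimal Python sketch. It assumes the vertex degrees have already been computed and are available as a dictionary from verb to degree. The function name `select_seed_conditions`, its parameters, and the choice of which half of each set enters the mixed condition are hypothetical, since the text does not specify them.

```python
import random


def select_seed_conditions(degrees, n=100, medium_ranks=(201, 300),
                           n_random=5, rng_seed=0):
    """Pick labeled verbs for each initialization condition.

    degrees: dict mapping verb -> vertex degree (assumed precomputed).
    Returns a dict with the high, medium, low, mixed and random seed sets.
    """
    lo_rank, hi_rank = medium_ranks
    if len(degrees) < hi_rank + n:
        raise ValueError("vocabulary too small for non-overlapping conditions")

    # Rank verbs from largest to smallest degree; ties are broken
    # alphabetically so that the ranking is deterministic.
    ranked = sorted(degrees, key=lambda verb: (-degrees[verb], verb))

    high = ranked[:n]                      # n largest degrees
    low = ranked[-n:]                      # n smallest degrees
    medium = ranked[lo_rank - 1:hi_rank]   # ranks are 1-based and inclusive

    # Mixed condition: half of the high-degree verbs plus half of the
    # medium-degree verbs. Taking the first half of each is an assumption.
    mixed = high[:len(high) // 2] + medium[:len(medium) // 2]

    # Random baseline: several independent random seed sets, whose
    # alignment scores would then be averaged.
    rng = random.Random(rng_seed)
    random_sets = [rng.sample(ranked, n) for _ in range(n_random)]

    return {"high": high, "medium": medium, "low": low,
            "mixed": mixed, "random": random_sets}
```

Each returned list would serve as the labeled data for one alignment run; the five random sets correspond to the five random initializations over which the baseline is averaged.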
Results
The alignment precisions in Figure 3 show a clear advantage of the high-degree and medium-degree conditions over the low-degree condition, but both the high-degree and low-degree conditions perform below the random baseline. We can also see an advantage of medium-degree initialization, which parallels basic-level categorization theories. When we use a mixed set of high-degree and medium-degree verbs, we get the best results across all conditions, which suggests that a diverse-degree initialization facilitates semantic space alignment.
Figure 3 Unlabeled accuracies of alignments with high-degree, medium-degree, low-degree, mixed-degree and random initializations
Table 1: Verbs with the largest, medium and smallest vertex degrees in ADS