
Babies Learning Language - Methods (07-08)

The document discusses the creation of MetaLab, a website that compiles up-to-date meta-analyses in developmental psychology, making it easier for researchers to access and explore data from over 16,000 infants. It also highlights the importance of open science practices, emphasizing that while they may not inherently make studies interesting, their absence can undermine a study's value. Additionally, it introduces childes-db, a flexible interface for the CHILDES database, aimed at improving accessibility and reproducibility in language development research.


[Figure: An example age-moderation relationship for studies of mutual exclusivity in early word learning.]

Meta-analyses can be immensely informative – yet they are rarely used by researchers. One reason may be that it takes some training to carry them out, or even to understand them. Additionally, meta-analyses (MAs) go out of date as new studies are published.

To facilitate developmental researchers’ access to up-to-date meta-analyses, we created MetaLab. MetaLab is a website that compiles MAs of phenomena in developmental psychology. The site has grown over the last two years from just a small handful of MAs to 15 at present, with data from more than 16,000 infants. The data from each MA are stored in a standardized format, allowing them to be downloaded, browsed, and explored using interactive visualizations. Because all analyses are dynamic, curators or interested users can add new data as the literature expands.
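
Because each MA ships in the same standardized format, re-running or extending an analysis is straightforward. Here's a minimal sketch in R of the kind of age-moderation analysis plotted above, assuming a CSV downloaded from MetaLab; the file name and column names are illustrative rather than quoted from the site's documentation.

```r
# Minimal sketch: random-effects meta-analysis of a MetaLab-style
# dataset with age as a moderator. File and column names are
# illustrative, not the official MetaLab schema.
library(metafor)  # standard R package for meta-analysis

ma <- read.csv("mutual_exclusivity.csv")  # hypothetical MetaLab export

# Effect size (d_calc) and its variance (d_var_calc), with mean
# participant age as a moderator -- the relationship in the figure above.
model <- rma(yi = d_calc, vi = d_var_calc, mods = ~ mean_age, data = ma)
summary(model)
```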
Read more »

Posted by Michael Frank at 10:43 AM No comments:

Labels: Development, Methods, Reproducibility


Thursday, December 7, 2017

Open science is not inherently interesting. Do it anyway.

tl;dr: Open science practices themselves don't make a study interesting. They are essential
prerequisites whose absence can undermine a study's value.

There's a tension in discussions of open science, one that is also mirrored in my own
research. What I really care about are the big questions of cognitive science: what makes
people smart? how does language emerge? how do children develop? But in practice I spend
quite a bit of my time doing meta-research on reproducibility and replicability. I often hear
critics of open science – focusing on replication, but also other practices – objecting that
open science advocates are making science more boring and decreasing the focus on
theoretical progress (e.g., Locke, Stroebe & Strack). The thing is, I don't completely
disagree. Open science is not inherently interesting.

Sometimes someone will tell me about a study and start the description by saying that it's preregistered, with open materials and data. My initial response is "ho hum." I don't really care whether a study is preregistered – unless I care about the study itself and suspect p-hacking. Then the only thing that can rescue the study is preregistration; without it, I don't care about the study any more, and I'm just frustrated by the wasted opportunity.

So here's the thing: although being open can't make your study interesting, the failure to pursue open science practices can undermine the value of a study. This post is an attempt to justify this idea by giving an informal Bayesian analysis of what makes a study interesting, and why transparency and openness are then the key to maximizing a study's value.
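
To preview the flavor of that analysis (my paraphrase of the idea, not the post's exact formulation): on a Bayesian view, the evidential value of a positive result is its likelihood ratio,

$$\mathrm{BF} = \frac{P(\text{positive result} \mid H_1)}{P(\text{positive result} \mid H_0)}.$$

A preregistered, confirmatory analysis pins $P(\text{positive result} \mid H_0)$ near the nominal $\alpha = .05$, whereas undisclosed analytic flexibility can push it far higher (Simmons, Nelson & Simonsohn estimated rates above .60 under combined flexibility), driving the ratio toward 1. The same "significant" result then carries almost no evidence – which is exactly the sense in which missing openness undermines a study's value.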

Read more »

Posted by Michael Frank at 8:40 AM No comments:

Labels: Methods, Reproducibility

Friday, November 10, 2017

Talk on reproducibility and meta-science


I just gave a talk at UCSD on reproducibility and meta-science issues. The slides are posted
here. I focused somewhat on developmental psychology, but a number of the studies and
recommendations are more general. It was lots of fun to chat with students and faculty,
and many of my conversations focused on practical steps that people can take to move
their research practice towards a more open, reproducible, and replicable workflow. Here
are a few pointers:

Preregistration. Here's a blogpost from last year on my lab's decision to preregister everything. I also really like Nosek et al.'s Preregistration Revolution paper. AsPredicted.org is a great gateway to simple preregistration (guide).

Reproducible research. Here's a blogpost on why I advocate for using RMarkdown to write
papers. The best package for doing this is papaja (pronounced "papaya"). If you don't use
RMarkdown but do know R, here's a tutorial.
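
For the flavor of this workflow, here's a minimal, hypothetical .Rmd skeleton (papaja::apa6_pdf is the package's APA manuscript format; the reported statistic and the variable effect_d are made up and would come from an earlier code chunk):

```
---
title  : "A minimal papaja sketch (hypothetical)"
author : "..."
output : papaja::apa6_pdf   # papaja's APA manuscript format
---

The effect was reliable, d = `r round(effect_d, 2)` -- the statistic is
computed from the data by code, so the manuscript text can never drift
out of sync with the analysis.
```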

Data sharing. Just post it. The Open Science Framework is an obvious choice for file sharing, and some nice video tutorials make it easy to get started.
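
For those who prefer to script even this step, here's a hedged sketch using the osfr R package (one option among several; the project title and file path are made up):

```r
# Hypothetical sketch: create an OSF project and post data from R
# using the osfr package. Authentication uses a personal access token.
library(osfr)
osf_auth()  # reads the OSF_PAT environment variable by default

project <- osf_create_project(title = "Word learning study: data & code")
osf_upload(project, path = "data/experiment1.csv")
```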

Posted by Michael Frank at 11:29 AM No comments:


Labels: Development, Methods, Reproducibility
Sunday, November 5, 2017

Co-work, not homework


Coordination is one of the biggest challenges of academic collaborations. You have two or more busy collaborators working asynchronously on a project. Either the collaboration ping-pongs back and forth with quick responses but limited opportunity for deeper engagement, or else one person digs in and really makes conceptual progress, but then has to wait an excruciating amount of time for collaborators to get engaged, understand the contribution, and respond themselves. What's more, there are major inefficiencies caused by having to load the project back into memory each time you begin again. ("What was it we were trying to do here?")

The "homework" model in collaborative projects is sometimes necessary, but often


inefficient. This default means that we meet to discuss and make decisions, then assign
"homework" based on that discussion and make a meeting to review the work and make a
further plan. The time increments of these meetings are usually 60 minutes, with the
additional email overhead for scheduling. Given the amount of time I and the collaborators
will actually spend on the homework the ratio of actual work time to meetings is
sometimes not much better than 2:1 if there are many decisions to be made on a project –
as in design, analytic, and writeup stages.* Of course if an individual has to do data
collection or other time-consuming tasks between meetings, this model doesn't hold!

Increasingly, my solution is co-work. The idea is that collaborators schedule time to sit together and do the work – typically writing code or prose, occasionally making stimuli or other materials – either in person or online. This model means that when conceptual or presentational issues come up, we can chat about them as they arise rather than waiting to resolve them by email or in a subsequent meeting.** As a supervisor, I love this model because I get to see how the folks I work with are approaching a problem and what their typical workflow is. This observation helps me give process-level feedback as I learn how people organize their projects. I also often learn new coding tricks this way.***

Read more »

Posted by Michael Frank at 3:36 PM 1 comment:

Labels: Methods

Friday, October 6, 2017

Introducing childes-db: a flexible and reproducible interface to CHILDES

Note: childes-db is a project that is a collaboration between Alessandro Sanchez, Stephan
Meylan, Mika Braginsky, Kyle MacDonald, Dan Yurovsky, and me; this blogpost was written
jointly by the group.

For those of us who study child development – and especially language development – the
Child Language Data Exchange System (CHILDES) is probably the single most important
resource in the field. CHILDES is a corpus of transcripts of children, often talking with a
parent or an experimenter, and it includes data from dozens of languages and hundreds of
children. It’s a goldmine. CHILDES has also been around since way before the age of “big
data”: it started with Brian MacWhinney and Catherine Snow photocopying transcripts (and
then later running OCR to digitize them!). The field of language acquisition has been a
leader in open data sharing largely thanks to Brian’s continued work on CHILDES.

Despite these strengths, using CHILDES can sometimes be challenging, especially at the most casual and the most in-depth ends of the usage spectrum. Simple analyses like estimating word frequencies can be done using CLAN – the major interface to the corpora – but these require more comfort with command-line interfaces and programming than can be expected in many classroom settings. At the other end of the spectrum, many of us who use CHILDES for in-depth computational studies like to read in the entire database, parse out many of the rich annotations, and get a set of flat text files. But doing this parsing correctly is complicated, and often small decisions in the data-processing pipeline can lead to different downstream results. Further, it can be very difficult to reconstruct a particular data prep in order to do a replication study. We've been frustrated several times when trying to reproduce others' modeling results on CHILDES, not knowing whether our implementation of their model was wrong or whether we were simply parsing the data differently.

To address these issues and generally promote the use of CHILDES in a broader set of research and education contexts, we’re introducing a project called childes-db. childes-db aims to provide both a visualization interface for common analyses and an application programming interface (API) for more in-depth investigation. Casual users can explore the data with Shiny apps, browser-based interactive graphs that supplement CHILDES’s online transcript browser. More intensive users can get direct access to pre-parsed text data using our API: an R package called childesr, which allows users to subset the corpora and get processed text. The backend of all of this is a MySQL database that’s populated using a publicly available – and hopefully definitive – CHILDES parser, to avoid some of the issues caused by different processing pipelines.
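
For a taste of the childesr workflow, here's a minimal sketch; get_utterances and the gloss column follow the released package and database, but treat the corpus and child as illustrative and the frequency count as a toy analysis:

```r
# Minimal sketch: query childes-db from R with childesr.
library(childesr)
library(dplyr)

# Pull one child's utterances from a classic corpus
adam <- get_utterances(corpus = "Brown", target_child = "Adam")

# A quick word-frequency estimate -- the kind of simple analysis
# that otherwise requires CLAN's command-line tools
adam %>%
  mutate(word = strsplit(gloss, " ")) %>%
  tidyr::unnest(word) %>%
  count(word, sort = TRUE) %>%
  head(10)
```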

Read more »

Posted by Michael Frank at 9:05 AM 6 comments:

Labels: Cognitive Science, Development, Methods, Reproducibility

Thursday, June 1, 2017

Confessions of an Associate Editor
