Computational Modeling of Cognition and Behavior
www.cambridge.org
Information on this title: www.cambridge.org/9781107109995
DOI: 10.1017/9781316272503
© Simon Farrell and Stephan Lewandowsky 2018
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2018
Printed in the United Kingdom by Clays, St Ives plc
A catalogue record for this publication is available from the British Library.
Library of Congress Cataloging-in-Publication Data
Names: Farrell, Simon, 1976– author. | Lewandowsky, Stephan, 1958– author.
Title: Computational modeling of cognition and behavior / Simon Farrell,
University of Western Australia, Perth, Stephan Lewandowsky, University of Bristol.
Description: New York, NY : Cambridge University Press, 2018.
Identifiers: LCCN 2017025806 | ISBN 9781107109995 (Hardback) | ISBN 9781107525610 (paperback)
Subjects: LCSH: Cognition–Mathematical models. | Psychology–Mathematical models.
Classification: LCC BF311 .F358 2018 | DDC 153.01/5118–dc23
LC record available at https://lccn.loc.gov/2017025806
ISBN 978-1-107-10999-5 Hardback
ISBN 978-1-107-52561-0 Paperback
Cambridge University Press has no responsibility for the persistence or accuracy
of URLs for external or third-party Internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
To Jodi, Alec, and Sylvie, with love (S.F.)
To Annie and the tribe (Ben, Rachel, Thomas, Jess, and Zachary), with love (S.L.)
Contents
1 Introduction 3
1.1 Models and Theories in Science 3
1.2 Quantitative Modeling in Cognition 6
1.2.1 Models and Data 6
1.2.2 Data Description 9
1.2.3 Cognitive Process Models 13
1.3 Potential Problems: Scope and Falsifiability 17
1.4 Modeling as a “Cognitive Aid” for the Scientist 20
1.5 In Vivo 22
References 427
Index 455
Illustrations
1.1 An example of data that defy easy description and explanation without a
quantitative model. 4
1.2 The geocentric model of the solar system developed by Ptolemy. 5
1.3 Observed recognition scores as a function of observed classification confidence
for the same stimuli (each number identifies a unique stimulus). 7
1.4 Observed and predicted classification (left panel) and recognition (right panel). 8
1.5 Sample power law learning function (solid line) and alternative exponential
function (dashed line) fitted to the same data. 11
1.6 The representational assumptions underlying GCM. 14
1.7 The effects of distance on activation in the GCM. 15
1.8 Stimuli used in a classification experiment by Nosofsky (1991). 16
1.9 Four possible hypothetical relationships between theory and data involving
two measures of behavior (A and B). 19
2.1 Graphical illustration of a simple random-walk model. 25
2.2 Predicted decision-time distributions from the simple random-walk model
when the stimulus is non-informative. 31
2.3 Predicted decision-time distributions from the simple random-walk model
with a positive drift rate (set to 0.03 for this example). 32
2.4 Predicted decision-time distributions from the modified random-walk model
with a positive drift rate (set to 0.035 for this example) and trial-to-trial
variability in the starting point (set to 0.8). 35
2.5 Predicted decision-time distributions from the modified random-walk model
with a positive drift rate (set to 0.03 for this example) and trial-to-trial
variability in the drift rate (set to 0.025). 36
2.6 Overview of the family of sequential-sampling models. 38
2.7 The basic idea: We seek to connect model predictions to the data from our
experiment(s). 40
3.1 Data (plotting symbols) from Experiment 1 of Carpenter et al. (2008)
(test/study condition) with the best-fitting predictions (solid line) of a power
function. 48
3.2 An “error surface” for a linear regression model given by y = Xb + e. 51
3.3 Two snapshots during parameter estimation of a simple regression line. 56
3.4 Two-dimensional projection of the error surface in Figure 3.2. 58
3.5 Probability with which a worse fit is accepted during simulated annealing as a
function of the increase in discrepancy (Δf) and the temperature parameter (T). 63
3.6 The process of obtaining parameter estimates for bootstrap samples. 66
3.7 Histograms of parameter estimates obtained by the bootstrap procedure, where
data are generated from the model and the model is fit to the generated
bootstrap samples. 68
4.1 An example probability mass function: the probability of responding A to
exactly N_A out of N = 10 items in a categorization task, where the probability of
an A response to any particular item is P_A = 0.7. 76
4.2 An example cumulative distribution function (CDF). 77
4.3 An example probability density function (PDF). 78
4.4 Reading off the probability of discrete data (top panel) or the probability
density for continuous data (bottom panel). 81
4.5 Distinguishing between probabilities and likelihoods. 83
4.6 The probability of a data point under the binomial model, as a function of the
model parameter P_A and the data point N_A, the number of A responses in a
categorization task. 84
4.7 Different ways of generating a predicted probability function, depending on
the nature of the model and the dependent variable. 91
4.8 The joint likelihood function for the Wald parameters m and a given the data set
t = [0.6 0.7 0.9]. 94
4.9 A likelihood function (left panel), and the corresponding log-likelihood
function (middle) and deviance function (−2 log likelihood; right panel). 97
4.10 A scatterplot between the individual data points (observed proportion A
responses for the 34 faces) and the predicted probabilities from GCM under
the maximum likelihood parameter estimates. 101
5.1 Simulated consequences of averaging of learning curves. 107
5.2 A simulated saccadic response time distribution from the gap task. 114
5.3 Left panel: Accuracy serial position function for immediate free recall of a
list of 12 words presented as four groups of three items. Right panel: Serial
position functions for three clusters of individuals identified using K-means
analysis. 118
5.4 The gap statistic for different values of k. 120
5.5 A structural equation model for choice RT. 122
6.1 Two illustrative Beta distributions obtained by the R code in Listing 6.1. 133
6.2 Bayesian prior and posterior distributions obtained by a slight modification of
the R code in Listing 6.1. 137
6.3 Jeffreys prior, Beta(0.5,0.5), for a Bernoulli process. 140
7.1 MCMC output obtained by Listing 7.1 for different parameter values. 150
7.2 MCMC output obtained by Listing 7.2 for different parameter values. 153
7.3 Experimental procedure for a visual working memory task in which
participants have to remember the color of a varying number of squares. 154
7.4 Data (circles) from a single subject in the color estimation experiment of
Zhang and Luck (2008) and fits of the mixture model (solid lines). 155
7.5 Posterior distributions of parameter estimates for g and σ_vM obtained when
fitting the mixture model to the data in Figure 7.4. 160
7.6 Overview of a simple Approximate Bayesian Computation (ABC) rejection
algorithm. 165
7.7 a. Data from a hypothetical recognition memory experiment in which people
respond “old” or “new” to test items that are old or new. b. Signal-detection
model of the data in panel a. 167
8.1 Illustration of a Gibbs sampler for a bivariate normal distribution. 174
8.2 Overview of how JAGS is used from within R. 178
8.3 Output obtained from R using the plot command with an MCMC object
returned by the function coda.samples. 181
8.4 a. Data from a hypothetical recognition memory experiment in which people
respond “old” or “new” to test items that are old or new. b. Signal-detection
model of the data in panel a. 182
8.5 Output from JAGS for the signal detection model illustrated in Figure 8.4. 185
8.6 Convergence diagnostics for the JAGS signal detection model reported in
Figure 8.5. 186
8.7 The one-high-threshold (1HT) model of recognition memory expressed as a
multinomial processing tree model. 187
8.8 Output from JAGS for the one-high-threshold (1HT) model illustrated in Figure 8.7. 190
8.9 a. Autocorrelation pattern for the output shown in Figure 8.8. b. The same
autocorrelations after thinning, in which only every fourth sample from each
MCMC chain is retained. 191
8.10 The no-conflict MPT model proposed by Wagenaar and Boer (1987) to account
for performance in the inconsistent-information condition in their experiment. 193
8.11 Output from a run of the no-conflict model for the data of Wagenaar and Boer
(1987) using Listings 8.8 and 8.9. 197
8.12 Example of a 95% highest density interval (HDI). 199
8.13 Diagram of the normal model, in the style of the book Doing Bayesian Data
Analysis (Kruschke, 2015). 201
8.14 Diagram of the normal model, in the style of conventional graphical models. 202
9.1 Graphical model for the signal-detection example from Section 8.3.1. 205
9.2 Graphical model for a signal-detection model that is applied to a number of
different conditions or participants. 206
9.3 Graphical model for a signal-detection model that is applied to a number of
different conditions or participants. 207
9.4 Hierarchical estimates of individual hit rates (left panel) and false alarm
rates (right) shown as a function of the corresponding individual frequentist
estimates for the data in Table 9.2. 210
9.5 Graphical model for a hierarchical model of memory retention. 213
9.6 Results of a run of the hierarchical exponential forgetting model defined in
Listings 9.3 and 9.4. 216
9.7 Posterior densities of the parameters a, b, and α of the hierarchical exponential
forgetting model defined in Listings 9.3 and 9.4. 216
9.8 Results of a run of the hierarchical power forgetting model defined in Listings
9.5 and 9.6. 218
9.9 Graphical model for a hierarchical model of intertemporal choice. 221
9.10 Data from 15 participants of an intertemporal choice experiment reported by
Vincent (2016). 223
9.11 Snippet of the data file from the experiment by Vincent (2016) that is used by
the R script in Listing 9.8. 224
9.12 Predictions of the hierarchical intertemporal choice model for the experimental
conditions explored by Vincent (2016). 226
9.13 Posterior densities for the parameters of the hierarchical intertemporal choice
model when it is applied to the experimental conditions explored by Vincent
(2016). 227
10.1 Fits of the polynomial law of sensation to noisy data generated from a
logarithmic function. 242
10.2 Predictions from a polynomial function of order 2 (left panel) and order 10
(right panel), with randomly sampled parameter values. 243
10.3 An illustration of the bias-variance trade-off. 246
10.4 The bias-variance trade-off. As model complexity (the order of the fitted
polynomial) increases, bias decreases and variance increases. 247
10.5 Out-of-set prediction error. 248
10.6 The two functions underlying prospect theory. 251
10.7 K-L distance is a function of models and their parameters. 257
10.8 Prior probability (solid horizontal line) and posterior probabilities (lines
labeled β and ) for two parameters in a multinomial tree model that are
“posterior-probabilistically-identified.” 267
11.1 Illustration of how the marginal likelihood can implement the principle of
parsimony. 275
11.2 Illustration of the Savage-Dickey density ratio for the signal detection model,
examining whether b = 0. 286
11.3 Autocorrelations in samples of the model indicator p_M2 using noninformative
pseudo-priors (left panel) and pseudo-priors approximating the posterior (right
panel). 294
11.4 Predicted hit and false alarm rates in a change-detection task derived using
non-informative (left-hand quadrants) and informative (right-hand quadrants)
prior distributions for two models of visual working memory. 305
12.1 A flowchart of modeling. 312
12.2 The effect of the response suppression parameter η in Lewandowsky’s (1999)
connectionist model of serial recall. 314
12.3 A schematic depiction of sufficiency and necessity. 319
13.1 Architecture of a Hebbian model of associative memory. 335
13.2 Different ways of representing information in a connectionist model. 336
13.3 Schematic depiction of the calculation of an outer product W between two
vectors o and c. 342
13.4 Generalization in the Hebbian model. 346