International Journal of Forecasting 28 (2012) 695–711

Contents lists available at SciVerse ScienceDirect

International Journal of Forecasting

journal homepage: www.elsevier.com/locate/ijforecast

The illusion of predictability: How regression statistics mislead experts

Emre Soyer a, Robin M. Hogarth a,b,∗
a Universitat Pompeu Fabra, Department of Economics & Business, Ramon Trias Fargas 25-27, 08005, Barcelona, Spain
b ICREA, Barcelona, Spain

Keywords: Regression; Presentation formats; Probabilistic inference; Prediction; Graphics; Uncertainty

Abstract

Does the manner in which results are presented in empirical studies affect perceptions of the predictability of the outcomes? Noting the predominant role of linear regression analysis in empirical economics, we asked 257 academic economists to make probabilistic inferences based on different presentations of the outputs of this statistical tool. The questions concerned the distribution of the dependent variable, conditional on known values of the independent variable. The answers based on the presentation mode that is standard in the literature demonstrated an illusion of predictability; the outcomes were perceived to be more predictable than could be justified by the model. In particular, many respondents failed to take the error term into account. Adding graphs did not improve the inference. Paradoxically, the respondents were more accurate when only graphs were provided (i.e., no regression statistics). The implications of our study suggest, inter alia, the need to reconsider the way in which empirical results are presented, and the possible provision of easy-to-use simulation tools that would enable readers of empirical papers to make accurate inferences.

© 2012 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.

∗ Corresponding author at: Universitat Pompeu Fabra, Department of Economics & Business, Ramon Trias Fargas 25-27, 08005, Barcelona, Spain. Tel.: +34 93 5422561; fax: +34 93 5421746.
E-mail addresses: [email protected] (E. Soyer), [email protected] (R.M. Hogarth).
0169-2070/$ – see front matter © 2012 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.
doi:10.1016/j.ijforecast.2012.02.002

1. Introduction

Much academic research in empirical economics involves determining whether or not one or several variables have causal effects on another variable. The statistical tool used for making such affirmations is typically regression analysis, where the terms “independent” and “dependent” are used to distinguish cause(s) from outcomes. The results from most analyses consist of statements as to whether or not particular independent variables are “significant” in affecting outcomes (the dependent variable), and most discussions of the importance of such variables focus on the “average” effects on outcomes of possible changes in inputs.

However, if the analysis is used for prediction, emphasizing only statistically significant average effects results in an incomplete characterization of the relationship between the independent and dependent variables. It is also essential to acknowledge the level of uncertainty inherent in outcomes of the dependent variable, conditional on values of the independent variable. For example, consider a decision maker who is pondering which actions to take and how much to do so in order to reach a certain goal. This requires conjectures to be formed about the individual outcomes that would result from specific inputs. Moreover, the answers to these questions depend not only on estimating average effects, but also on the distribution of possible effects around the average.

In this paper, we argue that the emphasis placed on determining average causal effects in the economics literature limits our ability to make correct probabilistic forecasts. In particular, the way in which results are presented in regression analyses obfuscates the uncertainty inherent in the dependent variable. As a consequence, consumers of the economic literature can be subject to what we call the “illusion of predictability”.

Whereas it can be argued that the way in which information is presented should not affect rational interpretation and analysis, there is abundant psychological
evidence demonstrating that such presentation effects do occur. Many studies have shown, for example, the way in which subtle changes in questions designed to elicit preferences are subject to contextual influences (see, e.g., Kahneman & Tversky, 1979). Moreover, these have been reported in both controlled laboratory conditions and field studies involving appropriately motivated experts (Camerer, 2000; Thaler & Sunstein, 2008). The human information processing capacity is limited, and the manner in which attention is allocated has important implications for both revealed preferences and inferences (Simon, 1978).

Recently, Gigerenzer and his colleagues (Gigerenzer, Gaissmaier, Kurz-Milcke, Schwartz, & Woloshin, 2007) reviewed research on how probabilities and statistical information are presented, and consequently perceived, by individuals or specific groups that use them frequently in their decisions. They show that mistakes in probabilistic reasoning and the miscommunication of statistical information are common. Their work focuses mainly on the fields of medicine and law, where doctors, lawyers and judges fail to communicate crucial statistical information appropriately in particular situations, thereby leading to biased judgments that have a negative impact on others. One such example is the failure of gynecologists to infer the probability of cancer correctly, given the way in which mammography results are communicated.

We examine the way in which economists communicate statistical information. Specifically, we note that much of the work in empirical economics involves the estimation of average causal effects through the technique of regression analysis. However, when we asked a large sample of economists to use the standard reported outputs of the simplest form of regression analysis to make probabilistic forecasts for decision making purposes, nearly 70% of them experienced difficulties. The reason for this, we believe, is that current reporting practices focus attention on the uncertainty surrounding the model parameter estimates, and fail to highlight the uncertainty concerning outcomes of the dependent variable conditional on the model identified. On the other hand, when attention was directed appropriately – by graphical as opposed to tabular means – over 90% of our respondents made accurate inferences.

In the next section, we provide some background on the practice and evolution of reporting empirical results in economics journals. In Section 3 we provide information concerning the survey we conducted with economists, which involved them answering four decision-oriented questions based on a standard format for reporting the results of regression analyses. We employed six different conditions designed to assess the differential effects due to model fit (R²) and different forms of graphical presentation (with and without accompanying statistics). In Section 4, we present our results. In brief, our study shows that the typical presentation format of econometric models and results – one based mainly on regression coefficients and their standard errors – leads economists to ignore the level of predictive uncertainty implied by the model and captured by the standard deviation of the estimated residuals. As a consequence, there is a considerable illusion of predictability. Adding graphs to the standard presentation of coefficients and standard errors does little to improve inferences. However, presenting the results in graphical fashion alone improved the accuracy. The implications of our findings, including suggested ways of improving statistical reporting, are discussed in Section 5.

2. Current practice

There are many sources of empirical analyses in economics. In order to obtain a representative sample of current practice, we selected all of the articles published in the 3rd issues (of each year) of four leading journals between 1998 and 2007 (441 articles). The journals were American Economic Review (AER), Quarterly Journal of Economics (QJE), Review of Economic Studies (RES) and Journal of Political Economy (JPE). Among these articles, we excluded those with time series analyses, and only included those with cross-sectional analyses where the authors identify one or more independent variables as statistically significant causes of relevant economic and social outcomes. Our aim is to determine how the consumers of this literature translate the findings about average causal effects into perceptions of predictability.

Many of the articles published in these journals are empirical. Over 70% of the empirical analyses use variations of regression analysis, of which 75% have linear specifications. Regression analysis is clearly the most prominent tool used by economists to test hypotheses and identify relationships among economic and social variables.

In economics journals, empirical studies follow a common procedure for displaying and evaluating results. Typically, authors provide a table that displays the descriptive statistics of the sample used in the analysis. Either before or after this display, they describe the specification of the model on which the analysis is based, then provide the regression results in detailed tables. In most cases, these results include the coefficient estimates and their standard errors, along with other frequently reported statistics, such as the number of observations and the R² values.

Table 1 summarizes these details for the sample of studies referred to above. It shows that, apart from the regression coefficients and their standard errors (or t-statistics), there is not much agreement as to what else should be reported. The data therefore suggest that economists probably understand the inferences that can be made about regression coefficients or the average impact of manipulating an independent variable quite well; however, their ability to make inferences about other probabilistic implications may be less well developed (e.g., predicting individual outcomes conditional on specific inputs).

It is not clear when, how, or why the above manner of presenting regression results in publications emerged. No procedure is ever explicitly stated in the submission guidelines for the highly ranked journals. Moreover, popular econometric textbooks, such as those of Greene (2003), Gujarati and Porter (2009) and Judge, Griffiths, Hill, and Lee (1985) do not explain specifically how to present results or how to use them for decision making. Hendry and Nielsen (2007) address issues regarding prediction in more detail than other similar textbooks. Another
exception is Wooldridge (2008), who dedicates several sections to presentation issues. His outline suggests that a good summary consists of a table with selected coefficient estimates and their standard errors, R² statistics, a constant, and the numbers of observations. Indeed, this is consistent with today’s practice. More than 60% of the articles in Table 1 follow a similar procedure.

Table 1
Distribution of types of statistics provided by studies in our sample of economics journals.
Studies that: | AER | QJE | JPE | RES | Total | % of total
... use linear regression analysis | 42 | 41 | 15 | 13 | 111 | x
... provide both the sample standard deviation of the dependent variable(s) and the R² statistic | 16 | 27 | 11 | 12 | 66 | 59
... provide R² statistics | 30 | 32 | 15 | 12 | 89 | 80
... provide the sample standard deviation of the dependent variable(s) | 21 | 32 | 11 | 13 | 77 | 69
... provide the estimated constant, along with its standard error | 19 | 14 | 4 | 1 | 38 | 34
... provide a scatter plot | 19 | 16 | 5 | 2 | 42 | 38
... provide the standard error of the regression (SER) | 5 | 3 | 1 | 1 | 10 | 9

Zellner (1984) conducted a survey of statistical practice based on articles published in 1978 in the AER, JPE, International Economic Review, Journal of Econometrics and Econometrica. He documented confusion as to the meaning of tests of significance, and proposed Bayesian methods for overcoming theoretical and practical problems. Similarly, McCloskey and Ziliak (1996) provided an illuminating study of statistical practice based on articles published in AER in the 1980s. They demonstrated that there was widespread confusion in the interpretation of statistical results, due to a confounding of the concepts of statistical and economic or substantive significance. Too many results depended on whether the t- or other statistics exceeded arbitrarily defined limits. In follow-up studies, Ziliak and McCloskey (2004, 2008) report that, if anything, this situation worsened in the 1990s (see also Zellner, 2004).

Empirical finance has developed an illuminating way of determining the significance of findings. In this field, once statistical analysis has identified a variable as being “important” in affecting, say, stock returns, it is standard to assess “how important” it is by evaluating the performance of simulated stock portfolios that use the variable (see, e.g., Carhart, 1997, and Jensen, 1968).

In psychology, augmenting significance tests with the effect size became common practice in the 1980s. For example, in its submission guidelines, Psychological Science, the flagship journal of the Association for Psychological Science, explicitly states, “effect sizes should accompany major results. When relevant, bar and line graphs should include distributional information, usually confidence intervals or standard errors of the mean”.

In forecasting, Armstrong (2007) initiated a discussion on not only the necessity of using effect size measures when identifying relationships among variables, but also the fact that significance tests should be avoided when doing so. He argues that the results of significance tests are often misinterpreted, and even when presented and interpreted correctly, they do not contribute to the decision making process. Schwab and Starbuck (2009) make an analogous argument for management science.

In interpreting the results of linear regression analysis from a decision making and predictive perspective, two statistics can convey a message to readers about the level of uncertainty in the results. These are R² and the Standard Error of the Regression (SER).¹ As a bounded and standardized quantity, R² describes the fit of a model. SER, on the other hand, provides information on the degree of predictability in the metric of the dependent variable.

¹ Some sources refer to SER as the Standard Error of Estimates, or SEE (see RATS), while others refer to it as the root Mean Squared Error or root-MSE (see STATA). Wooldridge (2008) uses the term Standard Error of the Regression (SER), defining it as “an estimator of the standard deviation of the error term”.

Table 1 shows that SER is practically never given in the presentation of results: less than 10% of the studies with linear specifications provide it. R² is the prevalent statistic reported to give an indication of model fit. This is the case for 80% of published articles with a linear specification. Table 1 also shows that more than 40% of the publications in our sample that utilize a linear regression analysis (excluding studies that base their main results on an IV regression) provide no information on either R² or the standard deviation of the dependent variable. Hence, a decision maker consulting the results of these studies cannot infer much about either the unexplained variance within the dependent variable or the cloud of data points to which the regression line is fitted. Alternatively, a scatter plot would be essential in order to indicate the degree of uncertainty. However, less than 40% of the publications in our sample provide a graph with actual observations.

Given the prevalence of empirical analyses and their potential use for decision making and prediction, debates about how to present results are important. However, it is important that such debates be informed by evidence as to the way in which knowledgeable individuals use currently available tools for making probabilistic inferences, and the way in which different presentation formats affect judgment. Our goal is to provide such evidence.

3. The survey

3.1. Goal and design

How do knowledgeable individuals (economists) interpret specific decision making implications of the standard output of a regression analysis? To find out, we used the following criteria to select the survey questions. First, we provided information about a well-specified model that strictly met the underlying assumptions of linear regression analysis. Second, the model was straightforward, in
that it had only one independent variable. Third, all of the information necessary for solving the problems posed was available from the output provided. Fourth, although sufficient information was available, respondents had to apply knowledge about statistical inference in order to make the calculations necessary for answering the questions.

This last criterion is the most demanding, because whereas economists may be used to interpreting the statistical significance of regression coefficients, they typically do not assess the uncertainties involved in prediction when an independent variable is changed or manipulated (apart from making “on average” statements that give no hint as to the distribution around the average).

Our study required respondents to answer four decision making questions, after being provided with information about a correctly specified regression analysis. There were six different conditions, which varied in the overall fit of the regression model (Conditions 1, 3, and 5 with R² = 0.50, the others with R² = 0.25), as well as in the amount and type of information provided. Figs. 1 and 2 report the information provided to the respondents for Conditions 1 and 2, which is similar in form and content to the outputs of many reports in the economic literature (and consistent with Wooldridge, 2008). Conditions 3 and 4 used the same tables, but provided the bivariate scatter-plots of the dependent and independent variables in addition to the standard deviation of the estimated residuals (see Figs. 3 and 4). In Conditions 5 and 6, the statistical outputs of the regression analyses were not provided, but the bivariate graphs of the dependent and independent variables were, as in Figs. 3 and 4.² In other words, for these two conditions we were intrigued by what would happen if respondents were limited to only consulting graphs.

² We thank Rosemarie Nagel for suggesting that we include Conditions 5 and 6.

Similarly to our survey on current practice in Section 2, we again restrict our attention to cross-sectional analyses in our experimental conditions. We are primarily concerned with determining the way in which findings on average causal effects are used for predictions and decision making. Our variations over different conditions would not be valid for time series studies, where the R² statistic does not provide information on the model fit. It is important to add that results are also discussed in the text in published papers. These discussions, which are mostly confined to certain coefficient estimates and their statistical significance levels, might distract decision makers from the uncertainties about outcomes. None of our conditions involve such discussions.

3.2. Questions

For Conditions 1, 3, and 5, we asked the following questions:

1. What would be the minimum value of X that an individual would need to make sure that s/he obtains a positive outcome (Y > 0) with 95% probability?
2. What minimum, positive value of X would make sure, with 95% probability, that the individual obtains more Y than a person who has X = 0?
3. Given that the 95% confidence interval for β is (0.936, 1.067), if an individual has X = 1, what would be the probability that s/he gets Y > 0.936?
4. If an individual has X = 1, what would be the probability that s/he gets Y > 1.001 (i.e. the point estimate)?

The questions for Conditions 2, 4, and 6 were the same, except that the confidence interval for β is (0.911, 1.130), and we ask about the probabilities of obtaining Y > 0.911 and Y > 1.02, given X = 1, in questions 3 and 4 respectively. All four questions are reasonable, in that they seek answers to questions that would be of interest to decision makers. However, they are not the types of questions that reports in economics journals usually lead readers to pose, and thus, they test a respondent’s ability to reason in a correct statistical manner given the information provided. In Appendix A, we provide the rationale behind the questions and the correct answers.

3.3. Respondents and method

We sent web-based surveys to faculty members in economics departments at leading universities worldwide. From the top 150 departments, ranked by numbers of econometric publications between 1989 and 2005 (Baltagi, 2007, Table 3), we randomly selected 113.³ Within each department, we randomly selected up to 36 faculty members. We ordered them alphabetically by their names and assigned Condition 1 to the first person, Condition 2 to the second person, ..., Condition 6 to the sixth person, then again Condition 1 to the seventh person, and so on.

³ We stopped sampling universities once we had at least 30 individual responses for each question asked. A few universities were not included in our sample because their webpages did not facilitate access to potential respondents. This was more frequent for non-US universities. For reasons of confidentiality, we do not identify any of these universities.

We conducted the survey online by personally sending a link for the survey, along with a short explanation, to the professional email address of each prospective participant. In this way, we managed to keep the survey strictly anonymous. We do know the large pool of institutions to which the participants belong, but have no means of identifying the individual sources of the answers. The participants answered the survey voluntarily. They had no time constraints and were allowed to use calculators or computers if they wished. We told all prospective participants that, at the completion of the research, the study along with the feedback on questions and answers would be posted on the web and that they would be notified,⁴ but did not offer them any economic incentives for participation.

⁴ In fact, this was done right after a first draft of the paper had been written.

As can be seen from Table 2, we dispatched a total of 3013 requests to participate. About one-quarter of potential respondents (26%) opened the survey and, we presume,
looked at the set-ups and questions. About a third of these (or 9% of all potential respondents) actually completed the survey. The proportion of potential respondents who opened the surveys and responded was highest for Conditions 5 and 6 (40%), as opposed to the 30% and 32% in Conditions 1 and 2, and 3 and 4, respectively. The average time taken to complete the survey was also lowest for Conditions 5 and 6 (see the notes to Table 2). We consider these outcomes again when we discuss the results below.

Table 2 documents characteristics of our respondents. In terms of position, the majority (59%) are at the rank of Associate Professor or higher. They also work in a wide variety of fields within the economics profession. Thirteen percent of respondents classified themselves as econometricians, and more than two-thirds (77%) used regression analysis in their work (41% “often” or “always”).

Fig. 1. Presentation of Condition 1. This mimics the methodology of 60% of the publications that were surveyed, and also the suggestions of Wooldridge (2008).

4. Results

4.1. Condition 1

The respondents’ answers to Condition 1 are summarized in Fig. 5. Three answers were removed from the data, being only “I don’t know”, or “?”. For the first two questions, responses within ±5 of the correct amount were considered correct. For questions 3 and 4, we considered correct any responses that were within ±5% of the answer.
We also regarded as correct the responses of four participants who did not provide numerical estimates, but mentioned that the answer was related mainly to the error term and its variance (there were 21 such responses across all conditions). The questions and the correct answers are displayed in the titles of the histograms in Fig. 5.

Fig. 2. Tables in Condition 2. The rest of the presentation is the same as Fig. 1.

Fig. 3. Bivariate scatter plot of Condition 1 and information on SER. Both were provided to participants in Condition 3, along with the estimation results. Only the graph was provided in Condition 5.

Fig. 4. Bivariate scatter plot of Condition 2 and information on SER. Both were provided to participants in Condition 4, along with estimation results. Only the graph was provided in Condition 6.

Most answers to the first three questions were incorrect. They suggest that the presentation leads to the respondents only evaluating the results through the coefficient estimates, and obscures the uncertainty implicit in the dependent variable. Specifically, Fig. 5 shows that:

1. 72% of the participants believe that for an individual to obtain a positive outcome with 95% probability, a small X (X < 10) would be enough, given the regression results. The majority state that any small positive value of X would be sufficient to obtain a positive outcome with 95% probability. In actual fact, in order to obtain a positive outcome with 95% probability, a decision maker should choose approximately X = 47.
2. 71% of the answers to the second question suggest that for an individual to be better off than another person with X = 0, with 95% probability, a small value of X (X < 10) would be sufficient. In fact, given that the person with X = 0 will also be subject to a random shock, the value of X needed to ensure this condition is approximately 67.
3. 60% of the participants suggest that, given X = 1, the probability of obtaining an outcome that is above the lower bound of the estimated coefficient’s 95% confidence interval is very high (greater than 80%). Instead, the correct probability is approximately 51%, as the uncertainty around the coefficient estimates in this case is small compared to the uncertainty due to the error term.
4. 84% of participants gave an approximately correct answer of 50% to question 4.

Fig. 5. Histograms for the responses to Condition 1. The top-left figure shows answers to question 1, the one on the top-right shows answers to question 2, the one on the bottom-left those to question 3, and the one on the bottom-right those to question 4. Each histogram also displays the question and the approximate correct answer. The dark column identifies the responses that we considered correct. Above each column is the number of participants who gave that particular answer. There were 39, 35, 45 and 44 responses to questions 1–4, respectively.

The participants’ answers to the first two questions suggest that the uncertainty affecting Y is not directly visible in the presentation of the results. The answers to question 3, on the other hand, shed light on what the majority of our sample sees as being the main source of fluctuation in the dependent variable. The results suggest that it is the uncertainty concerning the estimated coefficients that is seen to be important, not the magnitude of the SER. In the jargon of popular econometrics texts, whereas respondents were sensitive to one of the two sources of prediction error, namely the sampling error, they ignored the error term of the regression equation. The apparent invisibility of the random component in the presentation lures respondents into disregarding the error term, and into confusing an outcome with its estimated expected value.

In their answers to questions 3 and 4, the majority of participants claim that if someone chooses X = 1, there is a 50% probability of obtaining Y > 1.001, but that obtaining Y > 0.936 is almost certain. Incidentally, the high rate of correct answers to question 4 suggests that the failure to respond accurately to questions 1–3 was not because participants failed to pay attention to the task (i.e., they were not responding “randomly”).

Our findings echo those of Lawrence and Makridakis (1989), who showed in an experiment that decision makers tend to construct confidence intervals of forecasts using estimated coefficients, and fail to correctly take into account the randomness inherent in the process they are evaluating. Our results are also consistent with those of Goldstein and Taleb (2007), who showed that failing to interpret a statistic appropriately can lead to incorrect assessments of risk.

In summary, the results of Condition 1 show that the most common way of displaying results in the empirical economics literature leads to an illusion of predictability, in that part of the uncertainty is invisible to the respondents. In Condition 2, we test this interpretation by seeing whether the answers to Condition 1 are robust to different levels of uncertainty.

4.2. Conditions 2–4

If the presentation of the results causes the error term to be ignored, then the answers of the decision makers should not change in different set-ups, regardless of the variance of the error term, provided that its expectation is zero. To test this, we change only the variance of the error term in Condition 2 (see Fig. 2). Conditions 3 and 4 replicate Conditions 1 and 2, except that we add scatter plots and SER statistics (see Figs. 3 and 4).
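
The manipulation described in Section 4.2, changing only the variance of the error term, can be illustrated with a small simulation. The following Python sketch is not the survey's data or code: the sample size (1000), the distribution of X (normal, mean 50, s.d. 29), the error standard deviations (29 and 50) and the seeds are all illustrative assumptions, chosen so that R² comes out near 0.50 and 0.25. It fits a one-variable least-squares regression to each simulated data set and reports the slope, its standard error, the SER and R².

```python
import math
import random


def simulate_and_fit(beta, sigma, n=1000, seed=0):
    """Simulate Y = beta * X + e and fit a one-variable OLS regression.

    n, the X distribution and sigma are illustrative assumptions,
    not values taken from the survey materials.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(50.0, 29.0) for _ in range(n)]
    ys = [beta * x + rng.gauss(0.0, sigma) for x in xs]

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))

    ser = math.sqrt(ssr / (n - 2))   # standard error of the regression
    se_slope = ser / math.sqrt(sxx)  # standard error of the slope estimate
    r2 = 1.0 - ssr / syy
    return {"slope": slope, "se_slope": se_slope, "ser": ser, "r2": r2}


# Same slope, different (hypothetical) error standard deviations.
low_noise = simulate_and_fit(beta=1.0, sigma=29.0, seed=1)
high_noise = simulate_and_fit(beta=1.0, sigma=50.0, seed=2)
for label, fit in (("sigma = 29", low_noise), ("sigma = 50", high_noise)):
    print(label, {key: round(value, 3) for key, value in fit.items()})
```

Under these assumptions the slope estimate and its small standard error look much the same in both runs, while the SER and R² differ markedly. A table that foregrounds coefficients and standard errors therefore gives a reader little cue that individual outcomes are far less predictable in the second case.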

Table 2
Characteristics of respondents.
Condition | 1 | 2 | 3 | 4 | 5 | 6 | Total | %
Requests to participate | 568 | 531 | 548 | 510 | 438 | 418 | 3013 | –
Requests opened | 143 | 152 | 140 | 131 | 113 | 98 | 777 | 26
Surveys completed | 45 | 45 | 49 | 38 | 36 | 44 | 257 | 9
Position
Professor | 17 | 14 | 19 | 18 | 17 | 22 | 107 | 42
Associate professor | 8 | 7 | 12 | 10 | 6 | 2 | 45 | 18
Assistant professor | 12 | 18 | 16 | 9 | 9 | 12 | 76 | 30
Lecturer | 6 | 4 | 1 | 1 | 3 | 3 | 18 | 7
Other | 2 | 2 | 1 | 0 | 1 | 5 | 11 | 4
Total | 45 | 45 | 49 | 38 | 36 | 44 | 257 |
Use of regression analysis
Never | 7 | 5 | 11 | 11 | 6 | 15 | 55 | 23
Some | 11 | 16 | 17 | 10 | 17 | 13 | 84 | 36
Often | 16 | 14 | 7 | 7 | 7 | 8 | 59 | 25
Always | 5 | 5 | 8 | 6 | 6 | 7 | 37 | 16
Total | 39 | 40 | 43 | 34 | 36 | 43 | 235 |
Average minutes spent | 11.6 | 10.3 | 7.4 | 7.5 | 5.7 | 6.5 | 8.1 |
⟨Std. dev.⟩ | ⟨12.0⟩ | ⟨7.8⟩ | ⟨7.1⟩ | ⟨5.3⟩ | ⟨3.9⟩ | ⟨6.0⟩ | ⟨7.7⟩ |

Table 3
Comparison of results for Conditions 1 to 6.
Condition | 1 | 2 | 3 | 4 | 5 | 6
R² | 0.50 | 0.25 | 0.50 | 0.25 | 0.50 | 0.25
Scatter plot | No | No | Yes | Yes | Yes | Yes
Estimation results | Yes | Yes | Yes | Yes | No | No
Percentage of participants whose answer to:
Question (1) was X < 10 (Incorrect) | 72 | 67 | 61 | 41 | 3 | 7
Question (2) was X < 10 (Incorrect) | 71 | 70 | 67 | 47 | 3 | 15
Question (3) was above 80% (Incorrect) | 60 | 64 | 63 | 32 | 9 | 7
Question (4) was approx. 50% (Correct) | 84 | 88 | 76 | 84 | 91 | 93
Approximate correct answers are
Question 1 | 47 | 82 | 47 | 82 | 47 | 82
Question 2 | 67 | 116 | 67 | 116 | 67 | 116
Question 3 (%) | 51 | 51 | 51 | 51 | 51 | 51
Question 4 (%) | 50 | 50 | 50 | 50 | 50 | 50
Number of participants
Question 1 | 39 | 36 | 44 | 32 | 31 | 41
Question 2 | 35 | 30 | 39 | 32 | 30 | 39
Question 3 | 45 | 42 | 49 | 37 | 32 | 43
Question 4 | 44 | 41 | 49 | 37 | 32 | 43
Notes:
Question (1) What would be the minimum value of X that an individual would need to make sure that s/he obtains a positive outcome (Y > 0) with 95% probability?
Question (2) What minimum, positive value of X would make sure, with 95% probability, that the individual obtains more Y than a person who has X = 0?
Question (3) Given that the 95% confidence interval for β is (a, b), if an individual has X = 1, what would be the probability that s/he gets Y > a?
Question (4) If an individual has X = 1, what would be the probability that s/he gets Y > β̂?
In Conditions 1, 3 and 5, a = 0.936, b = 1.067 and β̂ = 1.001; in Conditions 2, 4 and 6, a = 0.911, b = 1.13 and β̂ = 1.02.

The histograms of the responses to the four questions in Conditions 2–4 are remarkably similar to those of Condition 1 (see Appendix B). These similarities are displayed in Table 3.

The similarities between the responses in Conditions 1 and 2 show that, under the influence of the current methodology, economists are led to overestimate the effects of explanatory factors on economic outcomes. The misperceptions demonstrated in the respondents' answers suggest that the way in which regression results are presented in publications can prevent even knowledgeable individuals from differentiating among different clouds of data points and uncertainties. At an early stage of our investigation, we also conducted the same survey (using Conditions 1 and 2) with a group of 50 graduate students in economics at Universitat Pompeu Fabra who had recently

Fig. 6. Histograms for the responses to Condition 5. The top-left figure shows answers to question 1, the one on the top-right shows answers to question
2, the one on the bottom-left those to question 3, and the one on the bottom-right those to question 4. Each histogram also displays the question and the
approximate correct answer. The dark column identifies the responses that we considered correct. Above each column is the number of participants who
gave that particular answer. There were 31, 30, 32 and 32 responses to questions 1–4, respectively.

taken an advanced econometrics course, as well as with 30 academic social scientists (recruited through the European Association for Decision Making). The results (not reported here) were similar to those of our sample of economists, and suggest that the origins of the misperceptions can be traced back to the methodology, as opposed to professional backgrounds.

Table 3 indicates that when the representation is augmented with a graph of actual observations and with statistical information on the magnitude of the error term (SER), the perceptions of the relevant uncertainty, and consequently the predictions, improve. However, around half of the participants still fail to take the error term into account when making predictions, and give answers similar to those in Conditions 1 and 2 (see Appendix B for histograms of responses to Conditions 3 and 4). This suggests that respondents still rely mainly on the table showing the estimated coefficients and their standard errors as the main tool for assessing uncertainty. Since the information provided in Conditions 3 and 4 is rarely provided in published papers, this does not provide much hope for improvement. Possibly more drastic changes are necessary. Conditions 5 and 6 were designed to test this suggestion.

4.3. Conditions 5 and 6

Our results so far suggest that, when making predictions using regression analysis, economists pay an excessive amount of attention to coefficient estimates and their standard errors, and fail to consider the uncertainty inherent in the relationships between the dependent and independent variables. What happens, therefore, when they cannot see estimates of coefficients and related statistics, but have only a bivariate scatter plot? This is the essence of Conditions 5 and 6 (see the graphs in Figs. 3 and 4).

Fig. 6 displays the histograms for the responses to the four questions in Condition 5. The responses to Condition 6 were similar, and the histograms are displayed in Appendix B. These histograms show that the participants are much more accurate in their assessments of uncertainty now than in the previous conditions (see also Table 3). In fact, when the coefficient estimates are not available, they are forced to attend solely to the graph, which depicts the uncertainty within the dependent variable adequately. This further suggests that scant attention was paid to the graphs when the coefficient estimates were present. Despite the unrealistic manner of presenting the results, Conditions 5 and 6 show that a simple graph can be better suited to assessing the predictability of an outcome than a table with coefficient estimates, or even than a presentation that includes both a graph and a table.

In Conditions 5 and 6, most of the participants, including some of those who made the most accurate predictions, protested in their comments about the insufficiency of the information provided for the task. They claimed that it was impossible to determine the answers without the coefficient estimates, and that all they did was to "guess" the outcomes approximately. Yet their guesses were more

accurate than the predictions from the previous conditions, which were the result of a careful investigation of the coefficient estimates and time-consuming computations. Indeed, as Table 2 indicates, the respondents in Conditions 5 and 6 spent significantly less time on the task than those in Conditions 1 and 2 (t(40) = 2.71 and t(40) = 2.38, p = 0.01 and 0.02, respectively).

4.4. Effects of training and experience

Table 2 shows that our sample of 257 economists varied widely in terms of professorial rank and the use of regression analysis in their work. We failed to find any relationship between the numbers of correct answers and either professorial rank or frequency of using regression analysis. A higher percentage of statisticians, financial economists and econometricians performed well relative to the average respondent (with 64%, 56%, and 51% providing correct answers, respectively, compared to the overall average of 35%). When the answers were more accurate, the average time spent was also slightly greater (10.2 min versus 9.3). Appendix C shows in detail the characteristics and proportions of respondents who gave accurate answers in Conditions 1–4.

5. Discussion

We conducted a survey of the probabilistic predictions made by economists on the basis of regression outputs similar to those published in leading economics journals. When given only the regression statistics which are typically reported in such journals, many respondents made inappropriate inferences. In particular, they seemed to locate the uncertainty of prediction in estimates of the regression coefficients, but not in the standard error of the regression (SER). Indeed, the responses hardly differed depending on whether the fit of the estimated model was 0.25 or 0.50.

We also provided some respondents with scatter plots of the regression, together with explicit information on the SER. However, this had only a small ameliorative effect, suggesting that respondents relied principally on the regression statistics (e.g., coefficients and their standard errors) when making their judgments. Finally, we forced other respondents to rely on a graphical representation by providing only a scatter plot, with no regression statistics. Members of this group complained that they did not have sufficient information, but, most importantly, were more accurate in their responses than the other groups, and also took less time to answer.

Several issues could be raised about our study, in relation to the nature of the questions asked, the specific respondents recruited, and their motivations for answering our questions. We now address these issues.

First, we deliberately asked questions that are usually not posed in journal articles because we sought to illuminate economists' appreciations of the predictability of economic relationships, as opposed to the assessment of the "significance" of certain variables (McCloskey & Ziliak, 1996; Ziliak & McCloskey, 2004, 2008). This is important. For example, even though economics articles do not typically address explicit decision making questions, the models can be used to estimate, say, the probability of reaching a given level of output for a specific level of input, as well as the economic significance of the findings. It is also important to understand that a policy that achieves a significantly positive effect "on average" might still be undesirable, because it leaves a large fraction of the population worse off. Hence, the questions are essential but "tricky" only in the sense that they are not the sorts of questions which economists typically ask.

Second, as was noted earlier, 26% of potential respondents took the time to open (and look at?) our survey questions, and 9% answered. Does this mean that our respondents were biased, and if so, in what direction were they biased? We clearly cannot answer this question, but we can state that our sample contained a substantial number of respondents (257), who represent various different characteristics of academic economists. Moreover, they were relevant respondents, in that they were recruited worldwide from leading departments of economics, as judged by publications in econometrics (Baltagi, 2007).

Third, by maintaining anonymity in the responses, we were unable to offer incentives to our respondents. However, would incentives have made a difference? Clearly, we cannot say without conducting a specific study. However, the consensus from previous results in experimental economics is that incentives increase effort and reduce the variance in the responses, but do not necessarily increase the average accuracy (Camerer & Hogarth, 1999). We also note that when professionals are asked questions which relate to their level of competence, there is little incentive to provide casual answers. Interestingly, our survey is a good simulation of the circumstances under which many economists read journal articles: there are no explicit monetary incentives; readers do not wish to make additional computations or to do work to fill in gaps left by the authors; and time is precious. Thus, the presentation of results is crucial.

Since our investigation concerns the way in which statistical results are presented in academic journals, it is important to ask what specific audience authors have in mind. The goal in leading economics journals is scientific: to identify which variables have an impact on some economic output and to assess the strength of the relationship. Indeed, the discussion of results often involves terms such as a "strong" effect, where the rhetoric reflects the size of t-statistics and the like. Moreover, the strength of a relationship is often described only from the perspective of an average effect, e.g., that a unit increase in an independent variable implies a δ increase in the dependent variable, on average.

As preliminary statements of the relevance of specific economic variables, this practice is acceptable. Indeed, although authors undoubtedly want to emphasize the scientific importance of their findings, we see no evidence of deliberate attempts to mislead readers into believing that the results imply a greater control over the dependent variable than is, in fact, the case. In addition, the papers have been reviewed by peers who are typically not shy about expressing their reservations. However, from a decision making perspective, the typical form of presentation can lead to

an illusion of predictability of the outcomes, given the underlying regression model. Specifically, there can be a considerable degree of variability around the expectations of effects, which needs to be calibrated in the interpretation of results. Thus, readers who don't "go beyond the information given" and take the trouble to calculate, say, the implications of some decision-oriented questions, may gain an inaccurate view of the results obtained.

At one level, it could be argued that the principle of caveat emptor should apply. That is, consumers of economic research should know how to use the information provided, and it is their responsibility to assess the uncertainty appropriately. It is not the fault of either the authors or the journals if they cannot. However, we make two arguments against the caveat emptor principle, as applied here.

First, as has been demonstrated by our survey, even knowledgeable economists experience difficulty in going beyond the information provided in typical outputs of regression analysis. If one wants to make the argument that people "ought" to do something, then it should also be clearly demonstrated that they "can".

Second, given the vast numbers of economic reports available, it is unlikely that most readers will take the necessary steps to go beyond the information provided. As a consequence, by reading journals in economics they will necessarily acquire a false impression of what the knowledge gained from economic research allows one to say. In short, they will believe that economic outputs are far more predictable than is actually the case.

We make all of the above statements under the assumption that econometric models describe empirical phenomena appropriately. In reality, such models may suffer from a variety of problems associated with the omission of key variables, measurement errors, multicollinearity, or estimating the future values of predictors. It can only be shown that model assumptions are, at best, approximately satisfied (they are not "rejected" by the data). Moreover, whereas the model-data fit is maximized within the particular sample observed, there is no guarantee that the estimated relationships will be maintained in other samples. Indeed, the R² value estimated on a fitting sample inevitably "shrinks" when predicting to a new sample, and estimating the amount of shrinkage a priori is problematic. There is also evidence that statistical significance is often wrongly associated with replicability (Tversky & Kahneman, 1971; see also Hubbard & Armstrong, 1994). Possibly, if authors discussed these issues further, people's perceptions of the predictability of outcomes would improve. However, these considerations are beyond the scope of the present study.

Furthermore, because our aim was to isolate the impact of the presentation mode on predictions, we made many simplifying assumptions. For instance, errors that are heteroskedastic and non-normally distributed, or the presence of fewer observations at the more extreme values of the dependent variable would also increase prediction error. Even though many estimation procedures do not require assumptions, such as that of normally distributed random disturbances, in order to obtain consistent estimates, the explanations which they provide through coefficient estimates and average values would be less accurate if the law of large numbers did not hold. Hence, in more realistic scenarios, where our assumptions are not valid, decisions that are weighted towards expected values and coefficient estimates would be even less accurate than our results indicate.

How then can current practice be improved? Our results show that providing graphs alone led to the most accurate inferences. However, since this excludes the actual statistical analysis evaluating the relationships between different variables, we do not deem it a practical solution. Nevertheless, we do believe that it is appropriate to present graphs together with summary statistics, as we did in Conditions 3 and 4, although this methodology does not eliminate the problem.

We seriously doubt that any substantial modification of current practice will be accepted. We therefore suggest augmenting reports by requiring the authors to provide internet links to simulation tools. These could explore different implications of the analysis, as well as let readers pose different probabilistic questions. In short, we propose that tools be provided which allow readers to experience the uncertainty in the outcomes of the regression.⁵

In fact, we recently embarked on a test of the effectiveness of simulations in facilitating probabilistic inferences (Hogarth & Soyer, 2011). In two experiments, conducted with participants at varying levels of statistical sophistication, respondents were provided with an interface where they sequentially sampled the outcomes predicted by an underlying model. In the first, we tested responses to seven well-known probabilistic puzzles. The second involved simulating the predictions of an estimated regression model, given one's choices, in order to make investment decisions. The results of both experiments are unequivocal. Experience obtained through simulations led to far more accurate inferences than attempts at analysis. Also, the participants preferred using the experiential methodology over analysis. Moreover, when aided by simulation, participants who were naïve with respect to probabilistic reasoning performed as well as those with university training in statistical inference. The results support our suggestion that the authors of empirical papers supplement the outputs of their analyses with simulation models that allow decision makers to "go beyond the information given" and "experience" the outcomes of the model given their inputs.

Although our suggestion would impose an additional burden on authors, it would reduce both effort and misinterpretation on the part of readers, and would make any empirical article a more accessible scientific product. Moreover, it has the potential to correct other statistical misinterpretations that were not identified by our study. As such, we believe that our suggestion goes a long way toward increasing our understanding of economic phenomena. At the same time, it also calls for additional research into understanding when and why different presentation formats lead to misinterpretation.

⁵ For example, by following the link http://www.econ.upf.edu/~soyer/Emre_Soyer/Econometrics_Project.html, the reader can investigate many questions concerning the two regression set-ups that we examined in this paper, and can also experience simulated outcomes.
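The idea of letting readers experience simulated outcomes can be sketched in a few lines of Python. This is only an illustration, not the authors' tool: it assumes the Condition 1 estimates used in Appendix A (intercept 0.32, slope 1.001, SER of about 29) and normally distributed errors, and the names `simulate_outcomes` and `share_positive` are hypothetical.

```python
import random

# Condition 1 estimates as used in Appendix A (assumed here for illustration).
C_HAT, BETA_HAT, SER = 0.32, 1.001, 29.0


def simulate_outcomes(x, n=20000, seed=0):
    """Draw n simulated outcomes Y = C + beta * x + e, with e ~ N(0, SER^2)."""
    rng = random.Random(seed)
    return [C_HAT + BETA_HAT * x + rng.gauss(0.0, SER) for _ in range(n)]


def share_positive(x, n=20000, seed=0):
    """Fraction of simulated outcomes with Y > 0 for a given choice of X."""
    ys = simulate_outcomes(x, n, seed)
    return sum(y > 0 for y in ys) / n


if __name__ == "__main__":
    # Compare a small, a moderate and a large choice of X.
    for x in (1, 10, 47):
        print(f"X = {x:>2}: share of simulated outcomes with Y > 0 = {share_positive(x):.2f}")
```

Under these assumptions, the share of positive outcomes for X = 47 should come out close to 0.95, whereas choices of X below 10 leave a substantial chance of a negative outcome.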

In addition to suggesting changes in the way in which statistical results are reported in journals in order to produce better inferences, our results also have implications for the teaching of statistical techniques. First, textbooks should provide a better coverage of the way to report statistical results, as well as instructions as to how to make probabilistic predictions. Even a cursory examination of leading textbooks shows that the topic of reporting currently receives little attention, while decision making is only considered through the construction of confidence intervals around predicted outcomes.

Together with estimating average effects, evaluating the predictive ability of economic models should become an important component of econometrics teaching. Indeed, if linked to the development and use of simulation methods, it could become a most attractive (and illuminating) part of any econometrics syllabus.

Finally, we note that scientific knowledge advances to the extent that we are able to forecast and control different phenomena. However, if we cannot make appropriate probabilistic statements about our predictions, our ability to assess our level of knowledge accurately is seriously compromised.

Acknowledgments

The authors are particularly grateful to the economists who took the time to answer the survey. In addition, they are indebted to several colleagues for excellent advice on matters ranging from planning the study, to its implementation, and the paper itself. These include Manel Baucells, Michael Greenacre, Gaël Le Mens, Stephan Litschig, Johannes Mueller-Trede, Omiros Papaspiliopoulos, Gueorgui Kolev, and especially Nick Longford, Rosemarie Nagel, and Thijs van Rens. We also thank the late Arnold Zellner for his comments on the work, and dedicate the paper in honor of his memory. Comments made at seminars at Universitat Pompeu Fabra, at the Spanish Economic Association Symposium (SAEe) 2010, and at the Royal Economic Society Ph.D. Meeting 2010 were also much appreciated. The research was supported by grants SEJ2006-14098 and EC02009-09834 from the Spanish Ministerio de Ciencia e Innovación.

Appendix A. Rationale for answers to the four questions

A.1. Preliminary comments

We test whether or not decision makers who are knowledgeable about regression analysis evaluate the unpredictability of an outcome correctly, given the standard presentation of linear regression results in an empirical study. To isolate the effects of a possible misperception, we created a basic specification. In this hypothetical situation, a continuous variable X causes an outcome Y, and the effect of one more X is estimated to be almost exactly equal to 1. The majority of the fluctuation in Y is due to a random disturbance uncorrelated with X, which is normally and independently distributed, with constant variance. Hence, the decision maker knows that all of the assumptions of the classical linear regression model hold (see, e.g., Greene, 2003).

A.2. Answers to questions 1 and 2

In the first two questions, participants are asked to advise a hypothetical individual who desires to have a certain level of control over the outcomes. This corresponds to the desire to obtain a certain amount of Y through some action X. The first question reflects the desire to obtain a positive outcome, whereas the second reflects the desire to be better off with respect to an alternative of no-action. If one considers only averages, the estimation results suggest that an individual should expect the relationship between X and Y to be one to one. However, when could an individual claim that a certain outcome has occurred because of their actions, and is not merely due to chance? How much does chance have to say in the realization of an outcome? The answers to these questions depend on the standard deviation of the estimated residuals (SER).

In a linear regression analysis, SER² corresponds to the variance of the dependent variable that cannot be explained by the independent variables, and is captured by the statistic (1 − R²). In Conditions 1 and 3, this is given as 50%. One can compute the SER using the (1 − R²) statistic and the variance of Y:

\[
\mathrm{SDER} = se(\hat{e}) = \sqrt{\mathrm{Var}(Y)\,(1 - R^2)} = \sqrt{(40.78^2)(0.5)} \approx 29. \tag{A.1}
\]

The answer to the first question can be calculated approximately by constructing a one-sided 95% confidence interval using Eq. (A.1). We are looking for a value of X where

\[
\mathrm{Prob}\left(Z > -\frac{\hat{C} + \hat{\beta} X}{se(\hat{e})}\right)
= \mathrm{Prob}\left(Z > -\frac{0.32 + 1.001X}{29}\right)
= 0.95, \quad \text{where } Z \sim N(0, 1). \tag{A.2}
\]

Thus, to obtain a positive payoff with 95% probability, an individual has to choose:

\[
X = \frac{1.645 \times 29 - 0.32}{1.001} \approx 47. \tag{A.3}
\]

The answer to the second question requires one additional calculation. Specifically, we need to know the standard deviation of the difference between two random variables, that is

\[
(Y_i \mid X_i = x_i) - (Y_i \mid X_i = 0), \quad \text{where } x_i > 0. \tag{A.4}
\]

We know that (Y_i | X_i) is an identically, independently and normally distributed random error with an estimated standard deviation of 29. Given that different and independent shocks occur for different individuals and actions, the standard deviation of Eq. (A.4) becomes:

\[
\sqrt{\mathrm{Var}[(Y_i \mid X_i = x_i) - (Y_i \mid X_i = 0)]}
= \sqrt{\mathrm{Var}(Y_i \mid X_i = x_i) + \mathrm{Var}(Y_i \mid X_i = 0)}
= \sqrt{29^2 + 29^2} \approx 41. \tag{A.5}
\]

Thus, the answer to question 2 is:

\[
X = \frac{1.645 \times 41 - 0.32}{1.001} \approx 67. \tag{A.6}
\]

Similar reasoning is involved for Condition 2 (and thus also Conditions 4 and 6). For these conditions, the equivalent of Eq. (A.1) is

\[
\mathrm{SDER} = se(\hat{e}) = \sqrt{\mathrm{Var}(Y)\,(1 - R^2)} = \sqrt{(59.25^2)(0.75)} \approx 51, \tag{A.7}
\]

such that the answer to question 1 is:

\[
X = \frac{1.645 \times 51 - 0.62}{1.02} \approx 82. \tag{A.8}
\]

As for question 2, we need to find out about Eq. (A.4) in this condition:

\[
\sqrt{\mathrm{Var}(Y_i \mid X_i = x_i) + \mathrm{Var}(Y_i \mid X_i = 0)} = \sqrt{51^2 + 51^2} \approx 72, \tag{A.9}
\]

so that the answer to question 2 in Condition 2 becomes:

\[
X = \frac{1.645 \times 72 - 0.62}{1.02} \approx 116. \tag{A.10}
\]

A.3. Answers to questions 3 and 4

Here, we inquire about the way in which decision makers weight the different sources of uncertainty within the dependent variable. The answers to these questions provide insights as to whether or not the typical presentation of the results leads the participants to consider that the fluctuation around the estimated coefficient is a larger source of uncertainty in the realization of Y than it really is.

Question 3 asks about the probability of obtaining an outcome above the lower-bound of the 95% confidence interval of the estimated coefficient, given a value of X = 1.

In Conditions 1, 3 and 5, the lower-bound is 0.936. We can find an approximate answer to this question using the estimated model and the SER from Eq. (A.1), that is

\[
\begin{aligned}
\Pr(Y_i > 0.936 \mid X_i = 1)
&= \Pr(\hat{C} + \hat{\beta} X_i + \hat{e} > 0.936 \mid X_i = 1) \\
&= \Pr(\hat{e} > 0.936 - \hat{C} - \hat{\beta} X_i \mid X_i = 1) \\
&= \Pr\left(\frac{\hat{e}}{se(\hat{e})} > \frac{0.936 - \hat{C} - \hat{\beta} X_i}{se(\hat{e})} \,\middle|\, X_i = 1\right) \\
&= 1 - \Phi\left(\frac{0.936 - 0.32 - 1.001}{29}\right) \\
&= 1 - \Phi(-0.013) \approx 0.51,
\end{aligned} \tag{A.11}
\]

where Φ is the cumulative standard normal distribution.

Question 4 asks about the probability of obtaining an outcome above the point estimate, given a value of X = 1. In Conditions 1, 3 and 5, the point estimate is 1.001. We can use similar calculations in order to obtain an answer.

\[
\begin{aligned}
\Pr(Y_i > 1.001 \mid X_i = 1)
&= \Pr(\hat{C} + \hat{\beta} X_i + \hat{e} > 1.001 \mid X_i = 1) \\
&= \Pr(\hat{e} > 1.001 - \hat{C} - \hat{\beta} X_i \mid X_i = 1) \\
&= \Pr\left(\frac{\hat{e}}{se(\hat{e})} > \frac{1.001 - \hat{C} - \hat{\beta} X_i}{se(\hat{e})} \,\middle|\, X_i = 1\right) \\
&= 1 - \Phi\left(\frac{1.001 - 0.32 - 1.001}{29}\right) \\
&= 1 - \Phi(-0.01) \approx 0.5.
\end{aligned} \tag{A.12}
\]

For questions 3 and 4 of Condition 2 (and thus also 4 and 6), we follow a similar line of reasoning, using the appropriate estimates. Thus, for question 3,

\[
\begin{aligned}
\Pr(Y_i > 0.911 \mid X_i = 1)
&= \Pr(\hat{C} + \hat{\beta} X_i + \hat{e} > 0.911 \mid X_i = 1) \\
&= \Pr(\hat{e} > 0.911 - \hat{C} - \hat{\beta} X_i \mid X_i = 1) \\
&= \Pr\left(\frac{\hat{e}}{se(\hat{e})} > \frac{0.911 - \hat{C} - \hat{\beta} X_i}{se(\hat{e})} \,\middle|\, X_i = 1\right) \\
&= 1 - \Phi\left(\frac{0.911 - 0.61 - 1.02}{51}\right) \\
&= 1 - \Phi(-0.015) \approx 0.51,
\end{aligned} \tag{A.13}
\]

and for question 4,

\[
\begin{aligned}
\Pr(Y_i > 1.02 \mid X_i = 1)
&= \Pr(\hat{C} + \hat{\beta} X_i + \hat{e} > 1.02 \mid X_i = 1) \\
&= \Pr(\hat{e} > 1.02 - \hat{C} - \hat{\beta} X_i \mid X_i = 1) \\
&= \Pr\left(\frac{\hat{e}}{se(\hat{e})} > \frac{1.02 - \hat{C} - \hat{\beta} X_i}{se(\hat{e})} \,\middle|\, X_i = 1\right) \\
&= 1 - \Phi\left(\frac{1.02 - 0.61 - 1.02}{51}\right) \\
&= 1 - \Phi(-0.01) \approx 0.5.
\end{aligned} \tag{A.14}
\]

Appendix B. Histograms for the answers to Conditions 2, 3, 4 and 6

See Figs. B.1–B.4.

Appendix C. Detailed experimental data for Conditions 1–4

See Table C.1.
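The calculations of Appendix A can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original analysis: the helper names `ser`, `min_x` and `prob_above` are hypothetical, the SER is left unrounded (the paper rounds it to 29 and 51), and the Condition 2 intercept is taken as 0.62, as printed in Eqs. (A.8) and (A.10), although Eqs. (A.13) and (A.14) print 0.61.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # Z ~ N(0, 1)


def ser(sd_y, r2):
    """Standard deviation of the residuals, as in Eqs. (A.1) and (A.7)."""
    return sqrt(sd_y ** 2 * (1.0 - r2))


def min_x(se, c_hat, beta_hat, prob=0.95):
    """Value of X solving the one-sided condition of Eq. (A.2), as in Eq. (A.3)."""
    return (STD_NORMAL.inv_cdf(prob) * se - c_hat) / beta_hat


def prob_above(threshold, x, se, c_hat, beta_hat):
    """Pr(Y > threshold | X = x), following Eqs. (A.11)-(A.14)."""
    return 1.0 - STD_NORMAL.cdf((threshold - c_hat - beta_hat * x) / se)


# (sd of Y, R^2, intercept, slope, lower bound of the 95% CI) taken from Appendix A.
conditions = {
    "1, 3, 5": (40.78, 0.50, 0.32, 1.001, 0.936),
    "2, 4, 6": (59.25, 0.25, 0.62, 1.02, 0.911),
}

for label, (sd_y, r2, c_hat, beta_hat, lower) in conditions.items():
    se = ser(sd_y, r2)
    q1 = min_x(se, c_hat, beta_hat)
    # Question 2 uses the sd of a difference (Eqs. A.5 and A.9) and, following
    # Eqs. (A.6) and (A.10) as printed, keeps the intercept term.
    q2 = min_x(sqrt(2.0) * se, c_hat, beta_hat)
    q3 = prob_above(lower, 1, se, c_hat, beta_hat)
    q4 = prob_above(beta_hat, 1, se, c_hat, beta_hat)
    print(f"Conditions {label}: Q1 = {q1:.0f}, Q2 = {q2:.0f}, Q3 = {q3:.2f}, Q4 = {q4:.2f}")
```

After rounding, the results should agree with the approximate correct answers listed in Table 3. Using the exact normal quantile instead of 1.645 and an unrounded SER changes the values only slightly.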


Fig. B.1. Histograms for the responses to Condition 2. The top-left figure shows answers to question 1, the one on the top-right shows answers to question
2, the one on the bottom-left those to question 3, and the one on the bottom-right those to question 4. Each histogram also displays the question and the
approximate correct answer. The dark column identifies the responses that we considered correct. Above each column is the number of participants who
gave that particular answer. There were 36, 30, 42 and 41 responses to questions 1–4, respectively.

Fig. B.2. Histograms for the responses to Condition 3. The top-left figure shows answers to question 1, the one on the top-right shows answers to question
2, the one on the bottom-left those to question 3, and the one on the bottom-right those to question 4. Each histogram also displays the question and the
approximate correct answer. The dark column identifies the responses that we considered correct. Above each column is the number of participants who
gave that particular answer. There were 44, 39, 49 and 49 responses to questions 1–4, respectively.

Fig. B.3. Histograms for the responses to Condition 4. The top-left figure shows answers to question 1, the one on the top-right shows answers to question
2, the one on the bottom-left those to question 3 and the one on the bottom-right those to question 4. Each histogram also displays the question and the
approximate correct answer. The dark column identifies the responses that we considered correct. Above each column is the number of participants who
gave that particular answer. There were 32, 32, 37 and 37 responses to questions 1–4, respectively.

Fig. B.4. Histograms for the responses to Condition 6. The top-left figure shows answers to question 1, the one on the top-right shows answers to question
2, the one on the bottom-left those to question 3 and the one on the bottom-right those to question 4. Each histogram also displays the question and the
approximate correct answer. The dark column identifies the responses that we considered correct. Above each column is the number of participants who
gave that particular answer. There were 41, 39, 43 and 43 responses to questions 1–4, respectively.

Table C.1
Relationships between training, experience and responses in Conditions 1–4 (the number of respondents with correct answers is given in parentheses).

Condition 1 2 3 4 Total over four conditions Percentage of respondents with correct answers

Position
Professor 17 (4) 14 (5) 19 (6) 18 (11) 68 (26) 38
Associate Professor 8 (2) 7 (3) 12 (4) 10 (8) 37 (17) 46
Assistant Professor 12 (5) 18 (4) 16 (6) 9 (2) 55 (17) 31
Senior Lecturer 0 (0) 2 (1) 1 (0) 0 (0) 3 (1) 33
Lecturer 6 (1) 4 (0) 1 (0) 0 (0) 12 (1) 8
Post-Doctoral Researcher 2 (0) 0 (0) 0 (0) 0 (0) 2 (0) 0
Total 45 (12) 45 (13) 49 (13) 38 (21) 177 (62) 35
Research fields
Econometrics 14 (6) 11 (6) 10 (5) 14 (8) 49 (25) 51
Labor economics 12 (5) 11 (2) 14 (3) 10 (7) 47 (17) 36
Monetary economics 5 (1) 2 (0) 5 (2) 2 (0) 14 (3) 21
Financial economics 4 (1) 5 (3) 4 (3) 3 (2) 16 (9) 56
Behavioral economics 3 (1) 7 (2) 2 (1) 3 (0) 15 (4) 27
Developmental economics 8 (1) 2 (1) 9 (3) 5 (1) 24 (6) 25
Health economics 4 (0) 3 (0) 5 (1) 1 (1) 13 (2) 15
Political economy 3 (1) 5 (1) 7 (3) 4 (2) 19 (7) 37
Public economics 9 (1) 6 (1) 10 (4) 8 (6) 33 (12) 36
Environmental economics 1 (0) 2 (1) 3 (0) 2 (1) 8 (2) 25
Industrial organization 2 (1) 6 (1) 6 (1) 2 (1) 16 (3) 19
Game theory 4 (1) 1 (1) 4 (1) 5 (2) 14 (5) 36
International economics 6 (2) 6 (0) 7 (1) 2 (1) 21 (4) 19
Macroeconomics 9 (2) 9 (2) 13 (2) 6 (5) 37 (11) 30
Microeconomics 11 (2) 4 (2) 11 (5) 7 (4) 33 (13) 39
Economic history 2 (0) 2 (0) 6 (3) 2 (1) 12 (4) 33
Statistics 3 (1) 4 (4) 1 (1) 1 (1) 11 (7) 64
Other 0 (0) 0 (0) 1 (1) 0 (0) 1 (1) 100
Use of regression analysis
Never 7 (1) 5 (0) 11 (7) 11 (5) 34 (13) 38
Some 11 (4) 16 (6) 17 (0) 10 (5) 54 (15) 28
Often 16 (4) 14 (5) 7 (2) 7 (6) 44 (17) 39
Always 5 (3) 5 (1) 8 (4) 6 (2) 24 (10) 42
Total 39 (12) 40 (12) 43 (13) 34 (18) 156 (55) 35
Average minutes spent 12 (10.9) 10.6 (12.6) 7.4 (11.2) 7.5 (7.4) 8.1 (10.2) 8.1
Std. dev. 12 (9.4) 7.8 (9) 7.1 (12.3) 5.3 (5.2) 7.7 (9) 7.7
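
The final column of Table C.1 appears to be the number of respondents with correct answers divided by the number of respondents, both summed over the four conditions and rounded to a whole percent. The short Python sketch below checks that reading against the "Position" rows. The counts are copied from the table; the names `position_counts` and `percent_correct` are illustrative and do not come from the paper.

```python
# Counts copied from the "Position" block of Table C.1:
# label -> (respondents over four conditions, respondents with correct answers)
position_counts = {
    "Professor": (68, 26),
    "Associate Professor": (37, 17),
    "Assistant Professor": (55, 17),
    "Senior Lecturer": (3, 1),
    "Lecturer": (12, 1),
    "Post-Doctoral Researcher": (2, 0),
    "Total": (177, 62),
}


def percent_correct(total, correct):
    """Share of respondents with correct answers, rounded to a whole percent."""
    return round(100 * correct / total)


# Hypothetical helper output: one percentage per row label.
percentages = {
    label: percent_correct(total, correct)
    for label, (total, correct) in position_counts.items()
}

for label, pct in percentages.items():
    print(f"{label}: {pct}")
```

Under this assumed reading, the computed values agree with the percentages printed in the table for these rows (for example, 26 of 68 professors gives 38).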

References

Armstrong, J. S. (2007). Significance tests harm progress in forecasting. International Journal of Forecasting, 23, 321–327.
Baltagi, B. H. (2007). Worldwide econometrics rankings: 1989–2005. Econometric Theory, 23(5), 952–1012.
Camerer, C. F. (2000). Prospect theory in the wild: evidence from the field. In D. Kahneman, & A. Tversky (Eds.), Choice, values, and frames (pp. 288–300). New York, NY: Russell Sage Foundation & Cambridge University Press.
Camerer, C. F., & Hogarth, R. M. (1999). The effects of financial incentives in experiments: a review and capital-labor-production framework. Journal of Risk and Uncertainty, 19, 7–42.
Carhart, M. (1997). On persistence in mutual fund performance. Journal of Finance, 52(1), 57–82.
Gigerenzer, G., Gaissmaier, W., Kurz-Milcke, E., Schwartz, L. M., & Woloshin, S. (2007). Helping doctors and patients make sense of health statistics. Psychological Science in the Public Interest, 8(2), 53–96.
Goldstein, D. G., & Taleb, N. N. (2007). We don’t quite know what we are talking about when we talk about volatility. Journal of Portfolio Management, 33(4), 84–86.
Greene, W. H. (2003). Econometric analysis (5th ed.). Upper Saddle River, NJ: Prentice Hall.
Gujarati, D. N., & Porter, D. (2009). Basic econometrics. New York: McGraw-Hill Irwin.
Hendry, D. F., & Nielsen, B. (2007). Econometric modeling: a likelihood approach. NJ: Princeton University Press.
Hogarth, R. M., & Soyer, E. (2011). Sequentially simulated outcomes: kind experience vs. non-transparent description. Journal of Experimental Psychology: General, 140, 434–463.
Hubbard, R., & Armstrong, J. S. (1994). Replications and extensions in marketing—rarely published but quite contrary. International Journal of Research in Marketing, 11, 233–248.
Jensen, M. C. (1968). The performance of mutual funds in the period 1945–1964. Journal of Finance, 23(2), 389–416.
Judge, G. G., Griffiths, W., Hill, C. R., & Lee, T. C. (1985). Theory and practice in econometrics. New York: Wiley.
Kahneman, D., & Tversky, A. (1979). Prospect theory: an analysis of decision under risk. Econometrica, 47, 263–291.
Lawrence, M., & Makridakis, S. (1989). Factors affecting judgmental forecasts and confidence intervals. Organizational Behavior and Human Decision Processes, 42, 172–187.
McCloskey, D. N., & Ziliak, S. T. (1996). The standard error of regressions. Journal of Economic Literature, 34, 97–114.
Schwab, A., & Starbuck, W. H. (2009). Null-hypothesis significance testing in behavioral and management research: we can do better. In D. Bergh, & D. Ketchen (Eds.), Research methodology in strategy and management: Vol. 5 (pp. 29–54). Oxford, UK: Elsevier.
Simon, H. A. (1978). Rationality as process and product of thought. American Economic Review, 68(2), 1–16.

Thaler, R. H., & Sunstein, C. R. (2008). Nudge: improving decisions about health, wealth, and happiness. New Haven, CT: Yale University Press.
Tversky, A., & Kahneman, D. (1971). Belief in the law of small numbers. Psychological Bulletin, 76, 105–110.
Wooldridge, J. M. (2008). Introductory econometrics: a modern approach (3rd ed.). International Student Edition, Thomson, South Western.
Zellner, A. (1984). Posterior odds ratios for regression hypotheses: general considerations and some specific results. In A. Zellner (Ed.), Basic issues in econometrics (pp. 275–305). Chicago, IL: The University of Chicago Press.
Zellner, A. (2004). To test or not to test and if so, how? Comments on “size matters”. Journal of Socio-Economics, 33, 581–586.
Ziliak, S. T., & McCloskey, D. N. (2004). Size matters: the standard error of regressions in the American Economic Review. Journal of Socio-Economics, 33, 527–546.
Ziliak, S. T., & McCloskey, D. N. (2008). The cult of statistical significance: how the standard error costs us jobs, justice, and lives. Ann Arbor: University of Michigan Press.

Emre Soyer is a Ph.D. student in the Department of Economics & Business at Universitat Pompeu Fabra, Barcelona. A graduate of Koc University (Istanbul, Turkey) and the University of Nottingham (U.K.), he is interested in ways of structuring situations so as to help unleash human potential across a wide range of areas, ranging from simple decision problems to the content of educational programs.

Robin M. Hogarth is an ICREA Research Professor in the Department of Economics & Business at Universitat Pompeu Fabra, Barcelona. He has previously held appointments at INSEAD, London Business School, and the University of Chicago. He has published several books (most recently Dance with Chance with Spyros Makridakis and Anil Gaba) and many articles in psychology, management, and economics on topics related to human decision making. He is a past President of both the Society for Judgment and Decision Making and the European Association for Decision Making.

