Should we believe in the “belief in the law of small numbers?”

Moin Syed, based on reviews by Romain Espinosa and Kariyushi Rao
A recommendation of:

Revisiting the “Belief in the law of small numbers”: Conceptual replication and extensions Registered Report of problems reviewed in Tversky and Kahneman (1971) [Stage 1]

Submission: posted 23 February 2023
Recommendation: posted 19 June 2023, validated 19 June 2023
Cite this recommendation as:
Syed, M. (2023) Should we believe in the “belief in the law of small numbers?”. Peer Community in Registered Reports, .


Probability and randomness are foundational statistical concepts used not only throughout the sciences, but also in our daily lives to guide our behavior and make sense of the world. Their importance and widespread use may suggest that they are easy concepts to understand, yet that seems not to be the case. A classic article by Tversky and Kahneman (1971) on the “belief in the law of small numbers” revealed that professional psychologists tended to incorrectly perceive a small sample that is randomly drawn from a population as representative of that population. This finding has been hugely influential, inspiring myriad subsequent studies into error and bias when reasoning about probability. 
In the current study, Hong and Feldman (2023) propose a conceptual replication and extension of Tversky and Kahneman (1971). The original article was shockingly sparse on details regarding the method, sample, and findings, and, to our knowledge, has never been replicated. These facts are especially concerning given the foundational status that the article holds in the field. Hong and Feldman (2023) have developed a conceptual replication project, using the same approach and targeting the same claims from Tversky and Kahneman (1971), but modifying the wording of the stimuli for clarity and appropriateness for lay respondents. Although Tversky and Kahneman (1971) relied on professional psychologists as participants, many of their claims were not restricted to that population, but rather were generalized to all people—which is also how the findings have been subsequently applied. Thus, the change from professional to lay responders is entirely appropriate and the study will be diagnostic of the original claims.
Finally, Hong and Feldman (2023) extend the target study by manipulating the sample size indicated in the stimuli. Tversky and Kahneman (1971) relied on a single sample size in each scenario, leaving open the question as to how sample size might impact respondents’ reasoning. Accordingly, Hong and Feldman (2023) vary the sample size across the scenarios to determine whether participants answer differently as the sample size increases. 
The Stage 1 manuscript was evaluated over two rounds of in-depth review, the first round consisting of detailed comments from two reviewers and the second round consisting of a close read by the recommender. Based on detailed responses to the reviewers' comments, the recommender judged that the manuscript met the Stage 1 criteria and therefore awarded in-principle acceptance (IPA).
URL to the preregistered Stage 1 protocol:
Level of bias control achieved: Level 6. No part of the data or evidence that will be used to answer the research question yet exists and no part will be generated until after IPA.
List of eligible PCI RR-friendly journals:
1. Hong, C. K., & Feldman, G. (2023). Revisiting the “Belief in the law of small numbers”: Conceptual replication and extensions Registered Report of problems reviewed in Tversky and Kahneman (1971). In principle acceptance of Version 3 by Peer Community in Registered Reports.
2. Tversky, A., & Kahneman, D. (1971). Belief in the law of small numbers. Psychological Bulletin, 76(2), 105–110.   
Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.

Evaluation round #2

DOI or URL of the report:

Version of the report: 2

Author's Reply, 18 Jun 2023

Revised manuscript:

All revised materials uploaded to:, updated manuscript under sub-directory "PCIRR Stage 1\PCI-RR submission following R&R 2"

Decision by Moin Syed, posted 06 Jun 2023, validated 06 Jun 2023

June 6, 2023

Dear Authors,

Thank you for submitting your revised Stage 1 manuscript, “Revisiting the “Belief in the law of small numbers”: Conceptual replication and extensions Registered Report of problems reviewed in Tversky and Kahneman (1971),” to PCI RR.

As indicated in the previous decision letter, I reviewed the revised manuscript myself rather than returning it to the reviewers for further comment. I appreciate the careful and thorough approach you took to the revision, and believe that the current version is much improved and will be ready for a recommendation after some minor revisions.

Your study materials are such that they could be tinkered with endlessly to attempt to improve them. What you have crafted, following the helpful reviewer comments, is a solid set of questions that is certain to be informative and to stimulate further work. Thus, I am happy with the question wording and have only suggestions that are mostly cosmetic and aimed at clarifications.

1.     I understand your concern about the challenges of having many robustness analyses. However, in this case, it is not “many” but only one (per analysis), namely the removal of outliers. Moreover, I question the use of the analysis only if you fail to find support for your hypothesis, as this ignores the possibility that you found support because of the included outliers. In terms of integrating findings, if you find support in one analysis and lack of support in another, that provides some indication that the finding is “fragile” in some way, either because it is a weak signal or because it is highly dependent on specific conditions. This is a more informative result than a single test without a robustness check. Unless you can offer a strong and convincing argument to the contrary, I suggest you follow the procedure I describe here.

2.     Please elaborate on the “law of large numbers” in the Introduction (p. 13). It is introduced in passing at that point, and mentioned again later in the Introduction section, but the clearest statement about its importance for the study comes in the Method section (p. 36) where the potential competing hypothesis is discussed. Given that this appears to be a major motivation for the extension, a clearer and more consolidated treatment in the Introduction is needed.

3.     The subsection of the Introduction labeled “Exploratory directions” includes only that it will be updated in Stage 2. This should be removed completely. Although I am personally not super strict about modifications to the Introduction following IPA, some people are, and it is best to not intentionally introduce such into the process. If you know what the exploratory directions might look like and why they are useful, then say so, otherwise leave it be and introduce them as exploratory when doing the analysis.

4.     Similarly, the “Exploratory Extensions” subsection of the Method is too vague, with reference to “several dependent variables” being included. Here, given that it is the Method section and you plan to use these variables, you should include additional detail rather than remove them. The “exploratory analyses” section should also give some idea of what you plan to examine.

5.     I find it confusing to have the scholar versions included in Table 3, given those versions are not part of the study. Table 3 should only include the original and the lay versions, and the current Table 3 with the scholar versions can be added to supplemental if you think people would be interested in seeing them for future work.

6.     The role of the Bayes Factors is not clearly specified. Specifically, how will inferences be made if the Bayesian and NHST results diverge? What is the rationale for using a default Cauchy prior? That BFs are reported by default in the figures is not a sufficient reason to include them, rather they should be fully integrated if they are to be useful.

7.     You alternatively refer to the same process as “consent checks” and “attention checks.” I would stick to the former, as those are more descriptive of what they actually are.

8.     p. 41 – It states that “eight of the measures are replications” but I believe this should be seven.

9.     p. 41 – the first sentence after indicating the alpha threshold of .001 is confusing, referring to “up to six additional dependent variables, seven overall.” I was not clear what that meant.

Once again, I will review the revised version myself. I will attempt to do so immediately upon submission, and assuming that you are attentive to the issues outlined above, I imagine I will be able to recommend an in-principle acceptance at that time.


Moin Syed

PCI RR Recommender

Evaluation round #1

DOI or URL of the report:

Version of the report: 1

Author's Reply, 28 May 2023

Revised manuscript:

All revised materials uploaded to:, updated manuscript under sub-directory "PCIRR Stage 1\PCI-RR submission following R&R"

Decision by Moin Syed, posted 26 Apr 2023, validated 27 Apr 2023

April 26, 2023

Dear Authors,

Thank you for submitting your Stage 1 manuscript, “Revisiting the “Belief in the law of small numbers”: Conceptual replication and extensions Registered Report of problems reviewed in Tversky and Kahneman (1971),” to PCI RR. 

The reviewers and I were all in agreement that you are pursuing an important project, but that the Stage 1 manuscript would benefit from some revisions. Accordingly, I am asking that you revise and resubmit your Stage 1 proposal for further evaluation. I do not expect to return the revised version back to the reviewers, but will act on the manuscript myself. 

The reviewers provided thoughtful, detailed comments that align with my own reading of the proposal, so I urge you to pay close attention to them as you prepare your revision. A few points that require special attention:

1. The reviewers and I all had questions about your treatment of outliers, exclusions, and multiple hypothesis testing. Rather than having results-dependent approaches to these issues, you should treat the decisions as constituting a set of a priori robustness analyses. See reviewer comments for specific issues. I also question your decision to only include data from participants who completed the entire study, as there is an extensive literature on missing data highlighting how listwise deletion can often result in the largest bias. 

2. The Introduction section would benefit from some additional text about how you are conceptualizing the replication in relation to the target. That is, you frame the study as a conceptual replication but do not provide many details about why it constitutes such and the implications of the deviations. I know that the direct/conceptual distinction is widely used and accepted, but I tend to favor the Nosek & Errington (2020) perspective that shifts attention from the procedure to the claim, and thus does away with the distinction. After reading over the Tversky and Kahneman paper, they are quite loose with their claims, sometimes constraining them to psychologists/researchers whereas other times the claims seem to be applied to all people. If you take the former claim, then yours is a test of generalizability, whereas if you take the latter claim, yours is a test of replication. I don’t raise this issue to force you to think about it as I do, but to highlight how and why it would be beneficial to clarify the nature of the replication.

3. You indicate that Q7 was omitted because it was a repeated theme of Q5 and Q6, but you do not actually include what the question was and in what way it was a repeat. Additionally, based on the argument, one might wonder why both Q5 and Q6 are included—if Q7 is a repeat of both, is Q6 not a repeat of Q5? These questions can be briefly addressed by including the question and explaining how the theme is repeated. 

When submitting a revision, please provide a cover letter detailing how you have addressed the reviewers’ points.

Thank you for submitting your work to PCI RR, and I look forward to receiving your revised manuscript.

Moin Syed
PCI RR Recommender

Reviewed by ORCID_LOGO, 20 Apr 2023

Overall, I think that this is great work and will make a nice Registered Report. The authors spent a lot of time explaining the theory, their objective, and what they plan to do. I have only a few suggestions that I detail below. The more important concerns are: (i) outlier exclusion, and (ii) multiple-hypothesis testing.

Major Comments:

My main concern is about multiple-hypothesis testing (MHT). The authors mention it briefly on page 36 when they say that they will rerun the analyses with a stricter alpha if they fail to support the core hypothesis. First, if they reduce alpha, their probability of rejecting H0 will be even lower, so I don’t see how this could change their findings. Second, I do not think that reducing alpha arbitrarily is a good way to account for MHT. In general, I think that the authors should present uncorrected and corrected p-values all along the way. As their Study Design table demonstrates, all hypotheses aim to test the same question, i.e., whether there is a bias. So, all hypotheses are from the same family. I would strongly encourage the authors to present p-values corrected for a family-wise error rate of 5% (or something in this direction). I suggest using the Romano-Wolf correction, as the p-values might correlate (they test the same concept underlying the data). (We discuss it a bit here:
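As a rough illustration of reporting uncorrected and corrected p-values side by side, here is a minimal Python sketch. It uses the Holm step-down procedure, which also controls the family-wise error rate but is a simpler stand-in and not the Romano-Wolf correction named above (Romano-Wolf relies on resampling the data to account for dependence between tests). The function name and the p-values are hypothetical placeholders, not results from the study.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    # Indices sorted from the smallest to the largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical uncorrected p-values for one family of hypotheses.
raw = [0.001, 0.020, 0.030, 0.400]
corrected = holm_adjust(raw)
for p_raw, p_adj in zip(raw, corrected):
    print(f"uncorrected p = {p_raw:.3f}   Holm-adjusted p = {p_adj:.3f}")
```

Reporting both columns lets readers see which conclusions survive the family-wise correction and which depend on the uncorrected threshold.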

Outliers: I am not a big fan of excluding the outliers as they might convey some information. Have you thought about winsorizing those data instead? (Note: if you exclude them, do you exclude them for all the data analysis or only on this question?)
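To make the contrast with exclusion concrete, the following Python sketch shows one simple form of winsorizing: the k most extreme values at each end are pulled in to the nearest retained value, so no participant is dropped. The function name, the choice of k, and the data are assumptions for illustration only; the manuscript's actual outlier rule is not specified here.

```python
from statistics import mean


def winsorize(values, k=1):
    """Clamp the k smallest and k largest values to their nearest retained neighbours."""
    if 2 * k >= len(values):
        raise ValueError("k is too large for this sample")
    ordered = sorted(values)
    # Bounds are the (k+1)-th smallest and (k+1)-th largest observations.
    lower, upper = ordered[k], ordered[-k - 1]
    return [min(max(v, lower), upper) for v in values]


# Hypothetical responses with one extreme value at each end.
responses = [-50, 1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
winsorized = winsorize(responses, k=1)

print("raw mean:       ", round(mean(responses), 2))
print("winsorized mean:", round(mean(winsorized), 2))
print("sample size kept:", len(winsorized) == len(responses))
```

Unlike exclusion, the sample size is unchanged, and the extreme responses still count as being at the high or low end of the distribution.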

Minor comments:

Page 19: it is written "previous research demonstrated that people intuitively know that the larger a sample size is, the more likely to produce a uniform distribution".

—> This sentence is true because you consider here the population to be distributed according to a uniform distribution (in your next sentence). I would maybe suggest rephrasing it in a more general way (e.g., the empirical distribution tends to get closer to the population distribution when the sample size gets larger), no?
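The more general phrasing can be illustrated with a small simulation. The Python sketch below assumes a fair six-sided die as the population (a uniform distribution, chosen only for simplicity) and measures how far the observed face frequencies stray from the population probabilities as the number of rolls grows. The function name and the seed are arbitrary choices.

```python
import random
from collections import Counter


def max_deviation_from_population(n, sides=6, seed=0):
    """Largest gap between observed face frequencies and 1/sides over n rolls."""
    rng = random.Random(seed)
    counts = Counter(rng.randrange(sides) for _ in range(n))
    # Faces that never appear have a count of zero in the Counter.
    return max(abs(counts[face] / n - 1 / sides) for face in range(sides))


for n in (10, 100, 10_000, 100_000):
    print(f"n = {n:>7}: max deviation = {max_deviation_from_population(n):.4f}")
```

The same comparison would apply to a non-uniform population by replacing `1 / sides` with the relevant population probabilities.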

For Q1, I’d like to mention that the authors make two things vary (at the same time I assume?): the sample size for the exploratory study and the sample size for the confirmatory study. I’m fine with that because if the two sample sizes increase, the probability of replication should be larger. But it is not straightforward that both dimensions should jointly vary. (This holds for all other questions that involve two sample sizes.) [This relates to your paragraph on sample size manipulation.]

For Q2: I’m wondering whether increasing the sample sizes is the best option. Why not try to decrease it as well (i.e., going below 50)? The main point is that T&K say in their paper that the correct answer is 101, against a population mean of 100. Well, it is quite close to 100. (In my opinion, the participants’ error is very small.) The smaller the sample size, the more people should change their answers. You might be able to better detect effects going below the original sample size. (Just a suggestion though.)
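The arithmetic behind this point can be sketched in a few lines of Python. The values assumed here are those of the original Q2 item as I understand it (population mean of 100, a sample of 50, and a first observed score of 150); the function name is hypothetical. Since the remaining scores are each expected to equal the population mean, the expected sample mean is (first score + (n - 1) × population mean) / n.

```python
def expected_sample_mean(first_score, population_mean, n):
    """Expected mean of n scores when the first is known and the rest are random draws."""
    # Each of the remaining n - 1 scores has an expectation equal to the population mean.
    return (first_score + (n - 1) * population_mean) / n


for n in (10, 25, 50, 100, 200):
    print(f"n = {n:>3}: expected sample mean = {expected_sample_mean(150, 100, n)}")
```

With n = 50 this gives 101, and with n = 10 it gives 105, so smaller samples move the correct answer further from 100 and leave more room to detect whether respondents adjust.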

For Q4: I am wondering whether people understand what a « likelihood » is. Btw, you are asking here for the likelihood of having an association of 0.35 (like a point estimate?). I think that the likelihood of obtaining exactly this value is almost zero. In the original question, it is asked whether there is support for an association of 0.35. (So, it is in the confidence interval, which is more likely.) I feel like your extension questions are framed a bit differently from the original question.

It would be useful to have the type of answer respondents can give for each question. (Numerical input, probabilities, Likert, etc.)

In my view, the replication is between « close to far » and « far ». To be conservative, I’d suggest keeping « far ».

Exclusions: page 36, you did not mention whether you’ll include/exclude people who failed the attention checks.

Reviewed by , 25 Apr 2023
