Understanding the relationship between the perception of risks and benefits

Based on reviews by Katherine Fox-Glassman, Bjørn Sætrevik, Richard Brown and Toby Wise
A recommendation of:

Revisiting and updating the risk-benefits link: Replication of Fischhoff et al. (1978) with extensions examining pandemic related factors

Submission: posted 15 February 2022
Recommendation: posted 29 June 2022, validated 30 June 2022
Cite this recommendation as:
Chambers, C. (2022) Understanding the relationship between the perception of risks and benefits. Peer Community in Registered Reports, .


Everyday decisions involve weighing up many kinds of risks and benefits, prompting the question of how our perception of those risks relates to our perception of the associated benefits. Intuitively, we might assume that behaviours or practices that are judged by society as riskier would also be seen as carrying greater potential benefits, in keeping with the expression “high risk, high reward”. The psychology of risk perception, however, appears to be more complex. In a seminal study, Fischhoff et al. (1978) in fact found the opposite pattern: that perceived risk and perceived benefit were negatively correlated – behaviours or practices that were perceived to be higher risk tended to be perceived as carrying lower benefits. This counterintuitive finding has had a significant impact on the field of judgment and decision making, despite being subjected only rarely to close replication.
Using a large-scale online design, Frank and Feldman (2022) propose a replication that incorporates key elements of Fischhoff et al. (1978) as well as a recent replication by Fox-Glassman et al. (2016). In particular, the authors will reassess the strength and directionality of the relationship between perceived risks and perceived benefits, and how these relate to both risk characteristics and acceptable levels of risk. As part of a series of exploratory extensions, they will also examine the risk/benefit relationship for policies and practices related to the Covid-19 pandemic, including vaccinations, lockdowns, and social distancing.
The Stage 1 manuscript was evaluated over two rounds of in-depth review. Based on detailed responses to the reviewers' comments, the recommender judged that the manuscript met the Stage 1 criteria and therefore awarded in-principle acceptance (IPA).
URL to the preregistered Stage 1 protocol:
Level of bias control achieved: Level 6. No part of the data or evidence that will be used to answer the research question yet exists and no part will be generated until after IPA. 
List of eligible PCI RR-friendly journals:
1. Fischhoff, B., Slovic, P., Lichtenstein, S., Read, S., & Combs, B. (1978). How safe is safe enough? A psychometric study of attitudes towards technological risks and benefits. Policy Sciences, 9, 127-152. 
2. Fox-Glassman, K. T. & Weber, E. U. (2016). What makes risk acceptable? Revisiting the 1978 psychological dimensions of perceptions of technological risks. Journal of Mathematical Psychology, 75, 157-169.
3. Frank, J. M. & Feldman, G. (2022). Revisiting and updating the risk-benefits link: Replication of Fischhoff et al. (1978) with extensions examining pandemic related factors, in principle acceptance of Version 3 by Peer Community in Registered Reports.
Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.

Evaluation round #2

DOI or URL of the report:

Version of the report: v2

Author's Reply, 28 Jun 2022


Revised manuscript:

All revised materials uploaded to:, updated manuscript under sub-directory "PCIRR Stage 1\PCI-RR submission following R&R 2"

Decision by , posted 20 Jun 2022

All four reviewers kindly returned to evaluate your revised submission, and I'm happy to say that all are now broadly satisfied with your previous revision and response. As you will see, there is one relatively minor issue outstanding concerning the use of independent t-tests that I would like to see addressed. Once this is settled in a final minor revision and response, I will issue IPA immediately without further review.

Reviewed by , 10 Jun 2022

Thanks to the authors for their clear and thorough documentation of changes, both within the manuscript and in the Response to Reviewers.

On all but one of the points, the authors' explanations/edits/justifications do a good job of addressing my questions and suggestions in the first review.

The only remaining reservation I have is about the between-subjects t-tests. The authors write that they decided to run them "primarily due to the between-subjects design of the original study," and that they "will really only tell us if participants are rating risks differently than they are rating benefits. Unfortunately, this is all that can be done given the design."

I fully agree with that last sentence: that there's not really more to do in this area given the design, and that that's unfortunate. But I am wary of running analyses (any analyses, but especially this many) for the primary reason that it's the only analysis that can be done, in the absence of a theoretical reason to do so. The line "to make the most of the replicated design, we will also be conducting independent samples t-tests..." is one that would raise statistical red flags for me as a reader.

Is there literature that could guide us on whether to expect people to rate benefits differently than they do risks? (I mean in a group-comparison context; obviously there's good theoretical reason to look for correlations between how people rate risks and benefits.) Face-validity-wise, I don't understand what it would mean to say that "people rate the risks as greater than the benefits for technologies/activities A, B, and C, but vice versa for technologies/activities X, Y, and Z," unless the Ps had been instructed in how to quantify both risk and benefits on some common (or at least comparable) scale. In the absence of training, it seems possible that many Ps would instinctually judge risks based on human lives lost (or injuries, etc.), but judge benefits based on more economic criteria. Or that people could think about the risks as those to the people exposed, and the benefits as to society in general. Or that loss aversion could cause people to be more sensitive on the low end of the scale when rating risks, compared to how they use the same range of numbers when rating benefits. Or any number of other different uses of the same set of numbers when rating the two different constructs.

No matter the reason, if there are any differences in the assigning of values to "risk" and "benefit," that would render meaningless any direct comparison between the two constructs.

If the authors are set on including these t-tests in the study, they should include sound theoretical motivation for what kinds of differences they expect to see (effect sizes & directions), and for how those differences would be interpreted. But I would recommend omitting this whole analysis—the paper can easily stand on its own in terms of contributing to the field without the t-tests.

Reviewed by , 16 Jun 2022

The authors appear to have addressed my comments, along with those of the reviewers, thoroughly. I believe this represents a well thought out and important registered report, and I very much look forward to seeing the results!

Reviewed by , 31 May 2022

I am satisfied that the authors have skilfully responded to the queries raised in my initial report on 4 Apr 2022. The changes and explanations provided have addressed all points raised by my review and I look forward to tracking the progress of this study. 

Kind regards, 
Richard Brown

Reviewed by , 30 May 2022

I would like to thank the authors for their continued work on the project, and for their detailed and well-structured responses to my comments in the previous review round. I think most or all of my comments have been adequately addressed, and I have no further issues that need responding to.

Many of my previous comments invited the authors to consider changing some of their exploratory analyses to be confirmatory, in cases where there appeared to be good reasons to expect specific relationships. The authors have explained their preference to keep the balance between exploratory and confirmatory inferences as it was in the previous version. While I still think that a stronger emphasis on confirmatory analyses would improve the project, I think this should be up to the authors, and I respect their choices.

I'm sorry that my comment about "masked analysis" was unclear. I may have used unusual terminology, and should instead have said "blinded analysis". See e.g., for more about the approach, but I'm sure there are also more practical guides online. My comment was based on the PCI RR suggested evaluation criterion 1D to prevent undisclosed flexibility. My issue was that while the flexibility is limited for the confirmatory analyses, there appears to be room for flexibility in the planned exploratory analyses. Of course, flexibility comes with the territory for exploratory analyses. But the authors may want to protect themselves from parts of this, for example by "blinding" parts of the analyses, or "masking" parts of the data before starting the exploratory analysis. This could involve masking the identity or direction of the “benefit” and “risk” variables, the activity names and the different risk scales. That being said, I recognize that this would add an additional level of complexity on top of what is already a large and complex project. My suggestion may not fit with the plans and approach the authors already have for their exploratory analyses, so I will understand if they choose to forgo a blinded analysis.
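
As a rough illustration of what masking the identity and direction of variables could look like in practice, here is a minimal Python sketch. It is not taken from the manuscript or from any particular blinding guide: the function names `mask_columns` and `unmask`, the neutral `var_01`-style codes, and the toy ratings are all hypothetical. The idea is only that the analyst works on the masked table while the key is held back until the exploratory pipeline is fixed.

```python
import random


def mask_columns(columns, seed=None):
    """Rename columns to neutral codes in shuffled order and randomly flip signs.

    `columns` maps a variable name to a list of numeric ratings.
    Returns (masked, key); `key` maps each code to (original name, sign)
    and should be stored away from the analyst until unblinding.
    """
    rng = random.Random(seed)
    names = list(columns)
    rng.shuffle(names)  # hide the original column order
    masked, key = {}, {}
    for i, name in enumerate(names, start=1):
        code = f"var_{i:02d}"
        sign = rng.choice([1, -1])  # hide the direction of the variable
        masked[code] = [sign * v for v in columns[name]]
        key[code] = (name, sign)
    return masked, key


def unmask(masked, key):
    """Restore original names and directions once the analysis is locked."""
    return {
        name: [sign * v for v in masked[code]]
        for code, (name, sign) in key.items()
    }


# Hypothetical toy ratings, for illustration only.
ratings = {"risk": [10, 40, 25], "benefit": [80, 15, 30]}
masked, key = mask_columns(ratings, seed=1)
restored = unmask(masked, key)
```

A sign flip is only one possible way to hide direction, and whether it suits the planned rating scales is a judgment for the authors.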

My best wishes for your further work on this exciting project!

Evaluation round #1

DOI or URL of the report:

Author's Reply, 27 May 2022


Revised manuscript:

All revised materials uploaded to: , updated manuscript under sub-directory "PCIRR Stage 1\PCI-RR submission following R&R"

Decision by , posted 12 Apr 2022

I have now received detailed and constructive evaluations from four reviewers. As you will see, the reviews are broadly enthusiastic about the submission and are rich in suggestions for optimising both the study design and quality of reporting in the Stage 1 manuscript. Among the wider headline issues, the reviews prompt for greater consideration of study rationale and background literature, clarification (and addition) of a range of vital methodological details, justification of sampling, design and analytic decisions, and justification of procedural deviations from the replication study. From an editorial perspective, all of the issues raised seem addressable, therefore I am pleased to invite you to address the comments in a comprehensive revision and response.

Reviewed by , 09 Apr 2022

Reviewed by , 12 Apr 2022

Reviewed by , 04 Apr 2022

Please see attached file. 


Reviewed by , 08 Apr 2022

This study intends to replicate the finding demonstrated in Fischhoff et al. (1978) that perceived risk and perceived benefit are inversely correlated. This is a good candidate for replication, as it is clearly a highly cited study, but as yet there have been no well-powered and/or pre-registered attempts to replicate its findings. Additionally, the COVID-19 pandemic has illustrated the real-world relevance of such results, making the replication timely.

The study design appears largely well thought through and clearly described, although I would like to see some more details on how the methods differ from the original study. There are also some potential statistical issues that could be addressed.

1.       The planned sample size doesn’t appear to account for potential exclusions due to poor data quality. Will it still be well-powered even if e.g., 10% of subjects need to be excluded?

2.       Subjects will be recruited using MTurk – there are a couple of papers showing that Prolific provides better data quality (e.g., and it may be worth considering the choice of platform to maximise data quality

3.       It’s not entirely clear how the items used differ from those in the original study – it seems as though there are fewer items (14 + 4 rather than the 30 used in the original paper) and the activities/technologies asked about have also changed. It would be good to clarify this a little. 

4.       One concern with the number of items is potential lack of power, given that the planned analysis for Tasks 1a/1b involves averaging within item and then performing linear regression (assuming I’ve understood this correctly) – ultimately, it doesn’t matter how many subjects there are, if there are very few items and they aren’t strongly correlated this will be underpowered. With 14 items, a correlation of r=.68 could be detected with 95% power, but this is probably a lot higher than would be expected. It might be worth adding in more items and reducing the number of subjects. 

5.       The items related to COVID are described as exploratory but quite a lot of detail is provided regarding the planned analysis – it would probably be best to either remove this detail (to be filled in once data is collected, avoiding the appearance that it was a planned analysis), or explicitly treat this as pre-planned.

6.       More detail on how outliers will be detected might be useful – e.g., are there statistical tests that will be used to determine whether a data point is an outlier? 

7.       It looks as though Bayes factors will be used to determine support for null hypotheses based on the figures, but this doesn’t seem to be described clearly in the methods.

8.       It might be worth running the original analyses (e.g., those using geometric means), even if they are flawed, just to enable more direct comparison between studies.

9.       It’s not entirely clear to me how the regression for Task 1c is going to be conducted. The best approach might be to use a multi-level model, allowing for random slopes across subjects, as this would use all the data available without needing to average anything.
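
To make points 1 and 4 concrete, the following Python sketch shows one way to check them. It rests on assumptions that are not in the manuscript: power for the item-level correlation is approximated with the Fisher z transformation, and the helper names `correlation_power` and `inflate_for_exclusions` are hypothetical. Exact figures depend on the test used and on whether it is one- or two-tailed, so this approximation will not necessarily reproduce the numbers quoted in point 4.

```python
from math import atanh, ceil, sqrt
from statistics import NormalDist


def correlation_power(r, n, alpha=0.05, two_sided=True):
    """Approximate power to detect a true correlation r from n paired points.

    Uses the Fisher z approximation: atanh(sample r) is treated as normal
    with standard error 1 / sqrt(n - 3). Here n is the number of items,
    because the analysis averages within item before correlating.
    """
    if n <= 3:
        raise ValueError("the Fisher z approximation needs n > 3")
    norm = NormalDist()
    shift = atanh(abs(r)) * sqrt(n - 3)
    if two_sided:
        crit = norm.inv_cdf(1 - alpha / 2)
        return norm.cdf(shift - crit) + norm.cdf(-shift - crit)
    crit = norm.inv_cdf(1 - alpha)
    return norm.cdf(shift - crit)


def inflate_for_exclusions(n_required, exclusion_rate):
    """Participants to recruit so that n_required remain after exclusions."""
    if not 0 <= exclusion_rate < 1:
        raise ValueError("exclusion_rate must be in [0, 1)")
    return ceil(n_required / (1 - exclusion_rate))


# Power as a function of the number of items, for a few assumed true correlations.
for n_items in (14, 18, 30):
    for r in (0.3, 0.5, 0.68):
        power = correlation_power(r, n_items)
        print(f"items={n_items:2d}  r={r:.2f}  approx. power={power:.2f}")
```

The point of the sketch is that power in this analysis is driven by the number of items rather than the number of subjects, which is why adding items may help more than adding participants. `inflate_for_exclusions` covers point 1: it scales a required sample size up by an assumed exclusion rate, such as the 10% mentioned above.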

Overall, this is an interesting paper and I look forward to seeing how it progresses.


Toby Wise
