DOI or URL of the report: https://osf.io/xr2vb/?view_only=4238d2ee3d654c4f908a94efea82a027
Version of the report: v3
Please find the reviewer response attached as a PDF.
As promised, I returned the manuscript to Zoltan Dienes for a final evaluation. He offers some remaining suggestions for streamlining the analysis plan. The idea to include the Bayesian analyses at Stage 2 as exploratory analyses (and therefore remove them from the Stage 1 manuscript) strikes me as a sensible compromise given their lack of diagnosticity. However, I will leave you to consider these points and respond/revise. Provided you are able to respond comprehensively, we should be able to award in-principle acceptance without further in-depth review.
The authors have responded very thoroughly to my comments. I understand their attraction to Bayesian modelling - as a Bayesian myself - but I think the combination of frequentist and Bayesian approaches in the way suggested doesn't quite work. The Bayesian model is interpreted effectively as a significance test: Whether 0 is inside or outside a (100-X)% interval is the same as being significant at the X% level (see https://psyarxiv.com/bua5n/ pp 6-8). Further, power analyses tell one if a study is underpowered or not; so that is already apparent from the frequentist analyses, and the Bayesian analysis does not add to that. Incidentally, just one point of phrasing: The authors refer to a "true non-significant" result. Significance or non-significance is a property of a particular test applied to a particular sample, not a property of the population. So what the authors mean is a "true H0".
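The equivalence between an interval excluding 0 and a significance test can be illustrated with a minimal sketch. It assumes a normal approximation for the estimate; under a flat prior the posterior is then normal with the same mean and standard error, so the credible interval coincides numerically with the confidence interval. The function names and the example estimates are hypothetical and are not taken from the manuscript.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for both p-values and critical values


def two_sided_p(estimate, se):
    """Two-sided p-value for H0: effect = 0 under a normal approximation."""
    z = abs(estimate / se)
    return 2 * (1 - STD_NORMAL.cdf(z))


def interval(estimate, se, level):
    """Central interval at the given level (e.g. 0.89 or 0.95)."""
    z_crit = STD_NORMAL.inv_cdf(0.5 + level / 2)
    return estimate - z_crit * se, estimate + z_crit * se


def excludes_zero(lo, hi):
    """True when 0 lies outside the interval [lo, hi]."""
    return lo > 0 or hi < 0


# Hypothetical estimate and standard error, for illustration only.
est, se = 0.5, 0.3
for level in (0.89, 0.95):
    lo, hi = interval(est, se, level)
    alpha = 1 - level
    print(level, excludes_zero(lo, hi), two_sided_p(est, se) < alpha)
```

In this sketch the two booleans printed on each line always agree: checking whether 0 falls outside an 89% interval is the same decision as testing at the 11% level.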
Using the original study as a prior means the Bayesian posterior is a type of meta-analysis. That's good, but does not tell us whether this study is underpowered.
I would remove the Bayesian analyses from the pre-registration, as they do not actually influence conclusions; but the authors would of course be free to add them in an exploratory analysis section in the Stage 2, e.g. to get meta-analytic posterior estimates (though I wouldn't check whether 0 is inside or outside an HPDI, see previous ref).
DOI or URL of the report: https://osf.io/qcj5m?view_only=4238d2ee3d654c4f908a94efea82a027
Version of the report: v2
Reviewer response attached as PDF in addition to being uploaded to the OSF page.
The three reviewers who assessed your initial submission have now evaluated the revised manuscript, and the good news is that we are getting close to Stage 1 acceptance. You will find some remaining methodological points to address in two of the reviews, including a key point about streamlining the analysis (and consequently the logical chain of inference), and the suggestion to remove exploratory analyses from the Stage 1 manuscript (with which I agree).
I will consult swiftly with Zoltan Dienes concerning your further revised submission to ensure that his points have been adequately addressed (especially his points 3 and 5, which are most important).
The authors provided a thoughtful consideration of, and response to, all of the concerns raised.
I would like to thank the authors for revising the manuscript based on the review comments. My opinion is that IPA could be granted for this proposed revised plan.
The following points are minor and should be confirmed by the recommender:
- In multiple regression equations, β usually represents the partial regression coefficient, and x etc. would represent predictor variables. Perhaps the brackets themselves may represent the predictor, but \( \hat{Y} \) also contains a bracketed name, which can be confusing, so I think it would be better to write it in the least misleading way possible.
- Since there was no cleaned manuscript, many typos, etc. may be present. It is recommended that the cleaned manuscript file be checked by multiple third-party eyes in the final version before IPA if possible.
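On the notation point above, one possible way to write the model unambiguously is sketched here. The predictor names (CTQ, Baseline, Demog) are placeholders assumed for illustration only; they are not the manuscript's actual model specification.

```latex
\hat{Y}_i = \beta_0 + \beta_1\,\mathrm{CTQ}_i + \beta_2\,\mathrm{Baseline}_i + \beta_3\,\mathrm{Demog}_i
```

In this form each \( \beta_j \) is a partial regression coefficient, the subscript \( i \) indexes participants, and named variables appear only as predictors on the right-hand side, so \( \hat{Y}_i \) needs no bracketed label.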
The authors have addressed many of my points. There remain a few issues to resolve, the last one listed being most important.
1) " if any of the two tests were significant (p>.01 for the Shapiro-Wilk and p>.05 for the Kolmogorov-Smirnov)" The ">"s should be "<"s.
2) "Outliers for the CTQ scores were assessed using boxplots"
State how outlier is defined.
3) For the Bayesian analysis, why specifically 89% CIs? Why "AND HDIs"? But the bigger point is I don't know what role the Bayesian analyses play in the planned inference. What would count as the Bayesian analysis "concurring" with the frequentist one? A CI/HDI doesn't in itself allow rejecting or accepting H1 or H2. In fact the posterior distribution is guaranteed to give 100% probability to the claim that the relevant effect exists. I suggest picking one analysis and sticking with it.
4) To keep things clean, don't list exploratory analyses at this stage.
5) Most importantly, past relevant work found small to medium effect sizes, and the current study calculates power for small to medium effect sizes. That means the study is not powered to detect *all* plausible effects of interest. Thus a non-significant result would not count against the hypotheses of an effect being there. The authors cannot change N, so they should temper their conclusions such that a non-significant result just means reserve judgment.
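Points 1 and 2 above can be made concrete with a minimal sketch. The thresholds are the ones quoted in point 1, with the inequality in the corrected direction. The outlier rule shown (Tukey's 1.5 × IQR fences, a common boxplot convention) is an assumption for illustration; the manuscript does not state its definition, which is exactly what point 2 asks for. Function names and the example scores are hypothetical.

```python
from statistics import quantiles


def normality_rejected(p_shapiro, p_ks, alpha_shapiro=0.01, alpha_ks=0.05):
    """Normality is rejected if either test is significant (p BELOW its threshold)."""
    return p_shapiro < alpha_shapiro or p_ks < alpha_ks


def tukey_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR].

    Quartiles use the 'inclusive' method; other quantile conventions can
    move the fences slightly, so the method should be stated as well.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical scores, for illustration only.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(tukey_outliers(scores))          # values flagged by the 1.5 * IQR rule
print(normality_rejected(0.20, 0.03))  # KS test significant at .05
```

Whatever rule is chosen, writing it down in this form fixes both the threshold and the quantile method in advance.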
DOI or URL of the report: https://osf.io/98wmk?view_only=4238d2ee3d654c4f908a94efea82a027
Cover letter with reviewer response, and manuscript document (in tracked changes) attached as PDFs.
I now have three very helpful and constructive reviews of your submission. As you will see, the reviewers are broadly positive about the prospects of your manuscript, although some significant work will be needed to meet the Stage 1 criteria and achieve in-principle acceptance (IPA).
Among the main concerns are:
1. The logical coherence of the introduction and rationale, including making clear how reduced mu-opioid receptor density is related to increased reward sensitivity (a point raised in slightly different ways by two of the reviewers).
2. Considering the potentially confounding effects of expectancy.
3. Clarifying the precise details of the analysis plans and contingencies. For a revised manuscript, I would recommend generating and including analysis code on simulated data to verify suitability of the plans.
4. Clarifying the precise conditions that will confirm or disconfirm the predictions (which may entail the removal of redundant analyses). At present, the design plan does not sufficiently prespecify the conditions under which different conclusions will be drawn. This will require revision to both the main text and the study design table (while keeping the design table as succinct as possible).
5. Clarification of the level of bias control in the manuscript. In the submission checklist you selected Level 2: At least some data/evidence that will be used to answer the research question has been accessed and partially observed by the authors, but the authors certify that they have not yet observed the key variables within the data that will be used to answer the research question AND they have taken additional steps to maximise bias control and rigour (e.g. conservative statistical threshold; recruitment of a blinded analyst; robustness testing, multiverse/specification analysis, or other approach). Please add a section to the manuscript that makes clear the level of prior data observation that has taken place (and confirms the corresponding level of bias control achieved under the PCI RR taxonomy). The second part of the Level 2 definition does not appear to be tackled in your plans: additional steps to maximise bias control and rigour (e.g. conservative statistical threshold; recruitment of a blinded analyst; robustness testing, multiverse/specification analysis, or other approach). This will need to be comprehensively addressed to achieve IPA.
Overall, I believe the manuscript is sufficiently promising to invite a Major Revision. Your proposal addresses a scientifically valid question, and (from my own reading) strikes me as an innovative and valuable use of pre-existing data. Should you wish to revise, please ensure that you respond comprehensively to all of the issues raised above and in the reviews, including a point-by-point response to every comment of the reviewers, and a fully tracked-changes version of the revised manuscript.
In this manuscript, the authors describe a study in which they explore to what extent childhood adversity predicts acute subjective responses (“reward”) to mu-opioid agonists administered in a medical setting. This is a very interesting and important topic, a nice follow-up from the authors’ previous study, and a well-written start to a manuscript. The study has good scientific validity, and the hypotheses seem rational. However, there are some small matters that require clarification as described below.
Introduction
In the introduction, the authors state that early adversity is associated with reduced mu-opioid receptor density. It is not clear, however, how reduced mu-opioid receptor density relates to increased reward after exogenously administered opioids.
The authors write, “Here, we examined whether childhood adversity increases risk of opioid misuse via enhanced positive drug effects.” Are the authors actually planning to measure opioid misuse? Otherwise, this statement should be modified, as it does not accurately describe the experimental question.
In the intro, describe the mechanism of action of the two drugs (i.e., do they act as pure mu agonists? How do the doses used compare to one another?).
Methods:
One potential difficulty with this design is that it is not clear what role expectancy effects play in this study. What were the patients told about the medication they would be receiving? Did they know they would be receiving an opioid? There is some evidence that childhood adversity predicts placebo response in the context of analgesia, so one concern is that differences in expectancy effects between subjects with low and high adversity could confound the analysis.
Another potential pitfall is that the authors’ sample of patients will not have sufficient variability on the CTQ to be able to conduct the planned analyses. Presumably most participants will not have any history of childhood trauma. How will the authors ensure a substantial enough range on the CTQ to obtain meaningful results?
For the future submission: in the analysis, the group that got remifentanil and the group that got oxycodone should be compared on subjective effects to make sure that the doses of the different drugs were matched on this metric.
I read this study with interest even though I am not a complete expert on the topic, as it attempts to test as a natural experiment the hypothesis that human childhood detrimental situations, for which indirect evidence has been accumulated in very controlled situations, are associated with later opioid effects. Since this is an observational study in a natural setting, I am conservative as to whether the authors can draw conclusions about causality in their hypothesis here, but there is no doubt that the present study will still provide useful findings. Below is a list of points that I believe should be addressed in advance for a better protocol.
Many of the questions I have raised here about analysis would not be particularly problematic if it were all exploratory analysis. However, if it is to be registered as a confirmatory analysis, please clarify each hypothesis and the criteria for evaluating and interpreting the results.
This is a very interesting study making good use of a naturalistic situation to look at whether childhood adversity affects how people respond subjectively to opioids.
I didn't see any discussion of how bias is controlled, but I will presume the editor has this in hand.
My main point is that there is still plenty of scope for analytic flexibility. Specifically:
1. Normality is to be checked in a range of ways. Under what conditions will normality be presumed good enough to proceed? If it is not good enough, what will be the exact bootstrapping procedure?
2. Childhood adversity is to be measured using three IVs. If any one is significant in predicting a DV, will there be presumed to be a relationship between adversity and that DV? This gives one three shots at that conclusion. Either pick one main predictor or adjust with Bonferroni (etc) - and adjust the power calculation accordingly.
3. Specify exactly how demographic variables will be coded.
4. Specify exactly how ratings will be adjusted for baseline - e.g. will baseline ratings be entered as IVs?
5. For clarity, specify the full regression equation that will be used.
6. A lower-powered backup analysis is suggested by collapsing change scores into three categories. This gives another shot at the cherry. I suggest deleting this analysis.
7. Subjective effects will be measured in three different ways (feeling good, liking, feeling high). This gives three shots at getting the effect. I suggest averaging these ratings together (or else adjusting familywise error rate). Averaging will increase the reliability of the measure and give more power to detect a given raw effect size (i.e. difference in ratings units).
8. Determine what difference in rating units would be just meaningful, given the purpose to which the study could be used. How many units of feeling high is enough to care about? Put another way, a previous study found the bottom limit of the 95% CI for euphoria was 7 units on a 100 point scale. This corresponds to 0.7 units on a 10 point scale. Is this still enough to care about? (See p 10 here: https://psyarxiv.com/yc7s5/ ). If so, the fact that it is the bottom of a CI could be used to indicate it is roughly the lower limit of what is plausible; and if it is an effect one would care about, it is a minimal meaningful effect size that is just plausible. That means it is appropriate to be the effect size used for a power analysis. Note when converting from a raw to a standardized effect size, take into account if the DV is averaged, which will increase the standardized effect size for a given raw effect size.
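The arithmetic behind points 2, 7 and 8 can be sketched as follows. The 7-unit lower limit and the three ratings come from the points above; the single-rating SD (2.0) and the mean inter-rating correlation (0.5) are hypothetical values assumed purely for illustration, as are the function names. The sketch also assumes the three ratings have equal variances.

```python
from math import sqrt


def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha that keeps the familywise error rate at `alpha`."""
    return alpha / n_tests


def rescale(raw_effect, from_range, to_range):
    """Convert a raw effect between rating scales of different length."""
    return raw_effect * to_range / from_range


def sd_of_mean(sd_single, k, mean_r):
    """SD of the average of k equally variable ratings with mean correlation mean_r."""
    return sd_single * sqrt((1 + (k - 1) * mean_r) / k)


def spearman_brown(mean_r, k):
    """Reliability of the average of k ratings (Spearman-Brown formula)."""
    return k * mean_r / (1 + (k - 1) * mean_r)


raw = rescale(7, 100, 10)            # 7 units on a 100 point scale -> 10 point scale
sd_single, mean_r, k = 2.0, 0.5, 3   # hypothetical values, not from the manuscript

d_single = raw / sd_single                          # standardized effect, one rating
d_average = raw / sd_of_mean(sd_single, k, mean_r)  # standardized effect, averaged DV

print(raw, bonferroni_alpha(0.05, 3), spearman_brown(mean_r, k))
print(d_single, d_average)
```

Under these assumed values the same raw difference yields a larger standardized effect for the averaged DV than for a single rating, which is why the power calculation should use the SD of the measure actually analysed.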
Minor point from Introduction: Why would a reduction in mu-opioid receptor density create heightened reward sensitivity (as it is associated with a reduced analgesic response to the drug)?
Zoltan Dienes