Recommendation

Neurocognitive insights on instructed extinction in the context of pain

Based on reviews by Tom Beckers, Gaëtan Mertens and Karita Ojala
A recommendation of:

Modulatory effects of instructions on extinction efficacy in appetitive and aversive learning: A registered report

Submission: posted 15 October 2022
Recommendation: posted 13 July 2023, validated 13 July 2023
Cite this recommendation as:
Chambers, C. (2023) Neurocognitive insights on instructed extinction in the context of pain. Peer Community in Registered Reports. https://rr.peercommunityin.org/articles/rec?id=327

Recommendation

Rapid learning in response to pain is a crucial survival mechanism, relying on forming associations between cues in the environment and subsequent pain or injury. Existing evidence suggests that associations between conditioned stimuli (cues) and unconditioned aversive stimuli (such as pain) are learned faster than associations with appetitive stimuli that signal pain relief. In addition, when the link between a conditioned and unconditioned stimulus is broken (by unpairing them), the extinction of this learning effect is slower for aversive than appetitive stimuli, resulting in a flatter extinction slope. Understanding why extinction slopes are reduced for aversive stimuli is important for advancing theoretical models of learning, and devising ways of increasing the slope (and thus facilitating extinction learning) could help develop more effective methods of pain relief, particularly in the treatment of chronic pain.
 
In the current programmatic submission, Busch et al. (2023) will undertake two Registered Reports to test whether a verbal instruction intervention that explicitly informs participants about contingency changes between conditioned and unconditioned stimuli facilitates extinction learning, especially for aversive (painful) stimuli, and how changes in extinction learning relate to neural biomarkers of functional connectivity. In the first Registered Report, they will initially seek to replicate previous findings including faster acquisition of aversive than appetitive conditioned stimuli as well as incomplete extinction of aversive conditioned stimuli without verbal instruction. They will then test how the instruction intervention alters extinction slopes and the completeness of extinction for appetitive and aversive stimuli, using a range of behavioral measures (expectancy and valence ratings) and physiological measures (pupillometry, skin conductance responses). To shed light on the neural correlates of these processes, in the second Registered Report the authors will use functional magnetic resonance imaging (fMRI) to ask firstly how acquisition and extinction of aversive and appetitive conditioned responses are related to resting state brain connectivity within a network that includes ventromedial prefrontal cortex, amygdala, and striatum, and secondly, whether the effectiveness of instruction on extinction learning is associated with differences in resting state connectivity across this network.
 
The Stage 1 manuscript was evaluated over two rounds of in-depth review. Based on detailed responses to the reviewers' comments, the recommender judged that the manuscript met the Stage 1 criteria and therefore awarded in-principle acceptance (IPA).
 
URL to the preregistered Stage 1 protocol: https://osf.io/cj75p (under temporary private embargo)
 
Level of bias control achieved: Level 6. No part of the data or evidence that will be used to answer the research question yet exists and no part will be generated until after IPA.
 
List of eligible PCI RR-friendly journals:
 
References
 
1. Busch, L., Wiech, K., Gamer, M., Kincses, B., Spisak, T., Schmidt, K., & Bingel, U. (2023). Modulatory effects of instructions on extinction efficacy in appetitive and aversive learning: A registered report. In principle acceptance of Version 3 by Peer Community in Registered Reports. https://osf.io/cj75p
Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.

Reviews

Evaluation round #2

DOI or URL of the report: https://osf.io/y2zxn?view_only=9384332360894f718ae1ba304b5e1c5d

Version of the report: 2 (PCIRR-Stage1-Revised-Manuscript-Instructed_Extinction.docx)

Author's Reply, 05 Jul 2023

Decision posted 05 Jun 2023, validated 05 Jun 2023

Your revised manuscript has now been evaluated by the three reviewers from the previous round. The good news is that all are broadly satisfied and we are close to being able to award Stage 1 in-principle acceptance (IPA). As you will see, there are some remaining issues to address concerning clarification of the rationale/predictions and ROI definitions/labelling. Provided you are able to address these comprehensively in a final revision, I anticipate being able to award IPA without further in-depth Stage 1 review.

Reviewed by Tom Beckers, 02 Jun 2023

The authors have done a very careful and thorough job in addressing the reviewers' comments (mine but also those of the other 2 reviewers, as far as I can judge) and revising the manuscript. I do not see many remaining issues. One thing that still stands out, though, is the ambivalence regarding what the effect of instructions should be expected to be for the extinction of aversive versus appetitive learning. This is evident already in the abstract: the authors write initially that 'In a therapeutic context, this asymmetry [i.e., aversive learning being slower to extinguish than appetitive learning] that has been discussed as indicative of a ‘better safe than sorry strategy’ could potentially be overcome by making patients aware of the change in contingency' [implying that instructions should make extinction of aversive cues more like extinction of appetitive cues]. Yet a few lines down, they state that 'We expect [that the] effect [of instruction] is more pronounced for appetitive stimuli [than for aversive stimuli]', which implies that instructions would further enhance the asymmetry between extinction of appetitive and aversive cues in the context of pain. It is also the latter prediction that they subscribe to in their response to point 3 of my initial review. Here they refer to the same 'better safe than sorry strategy' that is supposed to yield the initial asymmetry between extinction of appetitive versus aversive cues as the reason to expect a lesser effect of instruction, whereas the opening sentences of the abstract suggest that instructions are expected to counter exactly the effects of that strategy.

This ambivalence regarding the anticipated effect of instructions shines through elsewhere also. E.g., on p. 4, lines 6-9, it is again suggested that instructions may be a promising method to prevent the incomplete extinction of aversive associations (implying that it would bring aversive extinction more in line with appetitive extinction). This conceptual confusion deserves ironing out, I think.

Other than that one issue, however, I think that the authors have done an excellent job and have dealt with all the comments in a satisfactory way. I would be happy to see this registered report advance to stage 2.

Tom Beckers

Reviewed by , 19 May 2023

After going through the revised manuscript and author replies in detail, I am happy with how the authors have responded to all of the reviewer comments and revised the manuscript accordingly. I commend the authors for their thorough work. 

The only remaining point I have pertains to the definitions of the regions-of-interest (ROIs) for the fMRI analyses, which I unfortunately overlooked in my first review - my apologies.
The manuscript states (p. 30, line 31-33 on tracked changes version): "SBFC analyses will use the left and right dlPFC and vmPFC, amygdala, and striatum as seeds, with masks derived from the FSL Harvard-Oxford Atlas...". 
However, dlPFC, vmPFC, or striatum do not exist as regions in the Harvard-Oxford Atlas (amygdala does). I would ask the authors to precisely define the regions from the Harvard-Oxford Atlas (or another atlas) that will be used to construct the specified ROIs, or to offer an alternative definition of the ROIs and a description of how the masks will be built.
As a side note of this point, striatum is mentioned in the manuscript for the first time as "ventral striatum" but all other mentions are to "striatum" only, so I am not certain if this means both ventral and dorsal striatum together, or is only used as a short form for ventral striatum - please clarify. Based on the extensive evidence that ventral and dorsal striatum have different functions, shown also in many fMRI conditioning/learning studies, it would make sense to differentiate them and (re)consider whether to include e.g. only ventral striatum, or both. 

Reviewed by , 08 May 2023

I'm satisfied with the authors' replies to my comments and I look forward to seeing the results of this RR.

Evaluation round #1

DOI or URL of the report: https://osf.io/spjh9?view_only=2ec6e1c7b79a4b45b53d33b4415cc7e2

Version of the report: 1 (PCIRR-Stage1-Snapshot-Instructed_Extinction.pdf)

Author's Reply, 05 May 2023

Decision posted 16 Feb 2023, validated 16 Feb 2023

I have now obtained three very helpful and detailed expert reviews of your submission. As you will see, the reviewers find much to like in your proposal, which I agree is broadly very rigorous and tackles a valid and interesting research question. With RRs, the devil is always in the detail, and there are a number of areas requiring careful attention to satisfy the Stage 1 criteria -- including clarification and strengthening of the rationale (especially for the fMRI experiment but to varying extents throughout both planned studies), additional methodological detail in key areas to ensure reproducibility and eliminate potential sources of bias (e.g. in relation to exclusion criteria, sampling plans, and analysis plans), and justification of specific design decisions.

Although there is some significant work to be done, the reviews are very constructive and I believe your manuscript is a promising candidate for eventual Stage 1 in-principle acceptance. On this basis, I would like to invite a major revision and point-by-point response to all comments, which I will return to the reviewers for another look.

Reviewed by Tom Beckers, 15 Feb 2023

I have read this stage 1 proposal with great interest. It addresses an interesting research question of basic and translational interest with overall sophisticated and appropriate methods. Whatever the study's results, they should be of interest to the field. However, whereas the overall rationale for the study is sound and convincing, quite a number of specific aspects of the design, procedures, hypotheses and analyses of the proposed study are less well justified or elaborated. As such, I think that there are a number of issues throughout the manuscript that merit addressing prior to the start of data collection. I list them below in order of appearance in the manuscript:

 

1.     The description of the prior study from which the current proposal takes inspiration is rather confusing to me (p. 3). First, on line 12 and beyond, I think CS and US have been switched; the increase and decrease in pain are USs, not CSs. More importantly, the description of the results of that study (lines 17-21) on the one hand suggests that changes in differential CS valence ratings over the course of extinction training were similar for appetitive and aversive CSs (lines 17-19) but at the same time states that extinction of aversive CS valence ratings was incomplete (with an unstated implication that it was complete for appetitive CS valence; lines 19-21). It would be good to clarify this.

2.     Further down the introduction (p. 4), it is stated that Sevenster et al. (2012) showed that instructed extinction immediately abolished differential US expectancies but left SCR to the CS+ unaffected (lines 5-7). While factually correct, this is a bit misleading, given that differential SCR was completely absent from the first extinction trial in the instructed extinction group (see Sevenster et al., 2012, Figure 4). More broadly, I think there is little evidence in the literature to support the claim that instructed extinction is less effective for SCR than for US expectancies (in fact, even the claim by Sevenster et al., 2012, that it is less effective for fear-potentiated startle has been disputed).

3.     I don’t fully understand what the rationale is for predicting a stronger effect of instructions on extinction of the appetitive than the aversive CS (e.g., p. 5, lines 18-19, but also elsewhere). Given that without instructions, extinction is expected to be weaker/slower for the aversive CS, one would think that there is more room for instructions to facilitate extinction for that cue. The authors seem to be ambivalent about this as well, because further down in the manuscript, they make different predictions in this regard for US expectancies (see H4c on p. 24) than for CS valence (see H4d on p. 25), without further justification or discussion. I think this needs straightening out, given that testing this specific interaction between US type and instruction is the core raison d’etre for the proposed study’s factorial design. In that sense, it is also a bit strange to formulate this analysis as being exploratory; the authors are clearly intending to do confirmatory analyses to test the presence of an interaction and undoubtedly hope to draw directional conclusions from the results regarding the presence or absence of a difference in the effect of instructed extinction on appetitive versus aversive learning.

4.     I think the introduction in its present form does not provide sufficient justification for the inclusion of the various dependent variables that will be measured. The authors do state that the US expectancy ratings will be the primary measure of interest, but what are the valence ratings, SCRs and pupil dilation responses each supposed to add? Why are they included? And how will the authors handle divergence in results between these measures? Justification for the inclusion of SCR and pupil dilation in particular is not trivial. Is there good evidence that SCR and pupil dilation are appropriate measures here, in a design that involves a salient appetitive as well as a salient aversive US? Particularly given the nature of the proposed procedure, where trials start with a pre-US situation of moderate pain: The fact that the appetitive CS no longer signals a reduction in pain during extinction might be perceived as an aversive outcome in the appetitive group, which could support an increase rather than a decrease in SCR to the appetitive CS during extinction, which would not actually reflect a lack of learning. The same may be true for pupil dilation. This may all hinder interpretation of possible differences between the appetitive and the aversive CS during extinction. At a minimum, this warrants some justification/consideration. None of these issues would seem to plague the US expectancy ratings, for which a direct comparison of responding to the appetitive and the aversive CS seems much more straightforward.

5.     Relatedly, no clear justification is provided for including only US expectancy as a measure of conditioning in the analyses for the second manuscript. 

6.     Regarding the sample size and stopping rule, two issues warrant elaboration. First, I think it would be more appropriate to stop data collection if for all hypotheses a BF10 of either 6 or 1/6 is reached, rather than a BF > 6 for all hypotheses, so as to not bias data collection towards positive results. Second, it isn’t clear how the authors, starting from the effect size in Sevenster et al. (2012), arrived at their intended sample size of 150 (p. 14). This deserves elaboration. 

7.     Reinstatement of conditioned responding (or recovery of extinguished responding in general) isn’t really covered in the introduction; as a result, the inclusion of a reinstatement phase in the experiment feels like it is lacking a clear rationale.

8.     The authors list an appropriately ordered difference in US painfulness ratings as manipulation check. Fair as this may be, I think a more relevant positive control would be the observation of differential acquisition for all measures and for both types of US by the end of acquisition.

9.     The US is variably described as an increase/decrease/constant temperature, or as a pain exacerbation / pain decrease / no change in pain. One is a way to achieve the second, obviously, but it would help clarity if the authors used a consistent terminology for what the USs are throughout.

10.  Participants are excluded if they have recently participated in pharmacological studies (p. 14), but can they have taken part in conditioning/extinction experiments before? It seems like that might affect their speed of acquisition and extinction learning rather substantially (cf. the literature on re-acquisition and re-extinction).

11.  It is a bit odd that the label for the US expectancy scale reads ‘most probably cooling decrease’ on the one side and ‘most probably heating’ on the other side (p. 16, line 11-12). Why not just ‘most probably cooling’ and ‘most probably heating’ (or, alternatively, ‘most probably cooling decrease’ and ‘most probably heating increase’)?

12.  On p. 17, covariates are introduced that haven’t been mentioned previously. Will these questionnaire scores be used for screening out participants only, or will they also be used for additional analyses on included participants?

13.  I found the description of the calibration procedure on p. 20 difficult to follow. In particular, I failed to make sense of the following sentence (lines 2-3): “This second step will consist of the application of the selected temperature level within a range of -1.5 to and +3.0°C in steps of 0.5°C”.

14.  Regarding the trial structure: the intertrial interval of 4-7 s seems unlikely to be sufficient for unconditioned skin conductance responses to the US to return to baseline. A longer interval seems indicated.

15.  Why does the extinction phase involve a slightly smaller number of trials per CS than the acquisition phase? This also translates in different numbers of trials in between US expectancy ratings, for instance, and more generally makes the analyses a bit difficult to compare between acquisition and extinction. Why not simply have 16 trials per CS in each phase?

16.  Regarding the analyses, I found the use of ‘time’ for the independent variable of trial a little unfortunate, given that also ‘time bin’ is used as an IV in some analyses, and time in the latter case refers to within-trial time, whereas in the former case it refers to a much larger scale. ‘Trial’ would probably be a clearer descriptor than ‘time’.

17.  SCR will be divided in FIR and SIR. It would be good to provide a rationale for this (or, alternatively, to not distinguish between FIR and SIR).

18.  Regarding the second-level connectivity analyses (p. 28), I would have expected that connectivity would be used as a predictor and acquisition and extinction indices as outcomes (lines 23-25).

19.  Regarding the same analyses, it isn’t clear whether also hypotheses regarding acquisition will be tested – the title on line 16 does suggest so, but on line 31-32, only hypotheses regarding extinction are mentioned.

20.  On the next page, to test the direct effect of instruction, US expectancy immediately before and after the instruction will be compared for the instructed group only (lines 5-7). This seems a bit odd. Wouldn’t it make more sense to compare the difference in US expectancy from Acq5 to Ext1 between the instructed and the non-instructed group? Likewise, to evaluate the effect of instruction on learning through experience, it would seem more sensible to compare the difference in US expectancy from the end of acquisition to the end of extinction between both groups. Confusingly, further down (lines 19-23), the proposed test of the hypothesis does involve an interaction with instruction group, in contradiction to the preceding section.

21.  Regarding the pilot study (p. 30), the authors indicate that one participant was excluded due to poor data quality. It would be good to know how poor data quality is defined, given that this may happen for the proposed study as well and should perhaps be mentioned as grounds for exclusion.

22.  While discussing the results of the pilot study, the authors indicate on p. 31 that, whereas the mean expectancy for the CSdecrease returned to zero, the mean expectancy for the CSincrease remained elevated at the end of extinction. I’m not certain that that is clear from Figure 2, in particular when considered relative to responding to CSmedium.

 

Tom Beckers
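
Illustrative note on point 6 of the review above: the symmetric stopping rule suggested there can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions. The function name `should_stop`, the hypothesis labels, and the Bayes factor values are hypothetical and do not come from the manuscript; only the threshold of 6 (and its reciprocal 1/6) is taken from the reviewer's comment.

```python
def should_stop(bf10_by_hypothesis, threshold=6.0):
    """Return True only when every hypothesis has conclusive evidence.

    Evidence counts as conclusive if BF10 >= threshold (support for H1)
    or BF10 <= 1/threshold (support for H0), so that stopping is not
    biased towards positive results.
    """
    return all(
        bf >= threshold or bf <= 1.0 / threshold
        for bf in bf10_by_hypothesis.values()
    )


# Hypothetical Bayes factors, for illustration only.
interim = {"H1": 8.2, "H2": 0.12, "H3": 2.5}  # H3 is still inconclusive
final = {"H1": 8.2, "H2": 0.12, "H3": 0.15}   # every hypothesis is conclusive

print(should_stop(interim))  # False
print(should_stop(final))    # True
```

Under a one-sided rule (BF10 > 6 for all hypotheses), the `final` case would never trigger stopping, even though the evidence for H2 and H3 is conclusive in favour of the null.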

Reviewed by , 06 Feb 2023

I’ve read the RR1 manuscript titled “Modulatory effects of instructions on extinction efficacy in appetitive and aversive learning: A registered report” by Dr. Busch and colleagues carefully. In the RR proposal, the authors want to investigate the effects of verbal instructions on aversive and appetitive conditioning, using heat pain (or pain relief) stimulation. For several outcome measures (i.e., CS expectancy ratings, CS valence ratings, pupil dilation and skin conductance responses) the effects of instructed extinction and aversive vs. appetitive conditioning will be assessed.

All in all, I think this is a good research proposal on a topic that, due to the methods involved (e.g., psychophysiological measures), usually suffers from restricted sample sizes. Therefore, I believe it is a good thing that a well-powered RR will be conducted on this topic. Furthermore, the report is well-written and the authors seem experts on the involved methods. As such, I do not have many things to add.

My only more major comment is related to the different dependent variables (i.e., CS expectancy ratings, CS valence ratings, pupil dilation and skin conductance responses) and how differences in results between these DVs should be interpreted. What if the CS type X time interaction during acquisition or extinction is significant for one DV, but not for the others? How should the hypothesis then be interpreted? I think that, in principle, the different DVs test the same hypothesis (e.g., steeper extinction slopes for appetitive than for aversive CSs). Therefore, I believe that correction for multiple testing should probably be applied. In that case, if a significant effect (using an adjusted alpha-level) is observed for any of the DVs, this effect can be interpreted as supporting the hypothesis.
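
As an illustration of the kind of correction meant here, the following minimal Python sketch applies a Holm-Bonferroni step-down procedure across the four DVs. The choice of Holm-Bonferroni is an assumption made for this example (no specific method is named above), and the function name `holm_reject` and all p-values are hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm-Bonferroni step-down procedure.

    Returns a dict mapping each label to True if its null hypothesis is
    rejected while controlling the family-wise error rate at alpha.
    """
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ordered)
    rejected = {label: False for label in p_values}
    for rank, (label, p) in enumerate(ordered):
        # The k-th smallest p-value (k = rank, from 0) is compared
        # with alpha / (m - k).
        if p <= alpha / (m - rank):
            rejected[label] = True
        else:
            break  # all larger p-values are retained as well
    return rejected


# Hypothetical p-values for the CS type x time interaction, one per DV.
p_by_dv = {
    "CS expectancy": 0.004,
    "CS valence": 0.030,
    "SCR": 0.200,
    "pupil dilation": 0.012,
}

decisions = holm_reject(p_by_dv)
print(decisions)
# The hypothesis would count as supported if any DV survives correction.
print(any(decisions.values()))
```

With these made-up values, the valence effect (p = 0.030) would be significant at an unadjusted alpha of .05 but does not survive the correction.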

Smaller comments:

-          Title: I struggle a bit with the terminology, because typically when considering appetitive conditioning, I think of things like pairing CSs with chocolate or erotic pictures (van den Akker et al., 2017). What the authors do in their paradigm seems more akin to relief learning (i.e., relief from a painful stimulus). I am not entirely sure whether this is the same thing as appetitive conditioning. However, I do not have good recommendations for the authors to change their terminology (except for maybe “pain” and “pain relief” learning). 

-          P. 4: I believe that Sevenster et al. (2012) observed a lack of effects of instructions on the startle response, rather than skin conductance responses. And even this interpretation is somewhat dubious, because in their follow-up tests, Sevenster et al. (2012) observed facilitated extinction in the instructed extinction group with startle as well. Indeed, the literature indicates mostly ubiquitous effects of verbal instructions with different DVs (Atlas & Phelps, 2018; Costa et al., 2015; Mertens et al., 2018; Mertens & De Houwer, 2016).

-          Fig. 1: Perhaps this figure can be a bit reorganized (particularly for any eventual publications) to make the size of the left panel figure larger (e.g., by putting the two images below one another, rather than next to each other).

-          P. 23: Regarding the covariate analyses including US painfulness ratings, gender, age, etc. Are these really needed? Perhaps in an exploratory sense, they could be interesting. However, adding them to the main analyses based on model improvement seems not needed and could complicate the interpretation of the results in my view. Particularly, due to randomization, any systematic effects of these covariates should be nullified. Furthermore, when effects are reported including the covariates, it may be hard to gauge for readers whether effects crucially depend on the inclusion of the covariates. Hence, for ease of interpretation, I would recommend to simply not include these covariates in the main analyses.

I am not an expert in fMRI analyses, so unfortunately, I could not really evaluate the appropriateness of the analyses. However, at first sight, I think the analyses seem appropriate and I believe that the preregistration of the analyses pipeline in this RR is very valuable, given the many degrees of freedom in analyzing fMRI datasets (Botvinik-Nezer et al., 2020).

 

References

Atlas, L. Y., & Phelps, E. A. (2018). Prepared stimuli enhance aversive learning without weakening the impact of verbal instructions. Learning & Memory, 25(2), 100–104. https://doi.org/10.1101/lm.046359.117

Botvinik-Nezer, R., Holzmeister, F., Camerer, C. F., Dreber, A., Huber, J., Johannesson, M., Kirchler, M., Iwanir, R., Mumford, J. A., Adcock, R. A., Avesani, P., Baczkowski, B. M., Bajracharya, A., Bakst, L., Ball, S., Barilari, M., Bault, N., Beaton, D., Beitner, J., … Schonberg, T. (2020). Variability in the analysis of a single neuroimaging dataset by many teams. Nature, 582(7810), 84–88. https://doi.org/10.1038/s41586-020-2314-9

Costa, V. D., Bradley, M. M., & Lang, P. J. (2015). From threat to safety: Instructed reversal of defensive reactions. Psychophysiology, 52(3), 325–332. https://doi.org/10.1111/psyp.12359

Mertens, G., Boddez, Y., Sevenster, D., Engelhard, I. M., & De Houwer, J. (2018). A review on the effects of verbal instructions in human fear conditioning: Empirical findings, theoretical considerations, and future directions. Biological Psychology, 137, 49–64. https://doi.org/10.1016/j.biopsycho.2018.07.002

Mertens, G., & De Houwer, J. (2016). Potentiation of the startle reflex is in line with contingency reversal instructions rather than the conditioning history. Biological Psychology, 113, 91–99. https://doi.org/10.1016/j.biopsycho.2015.11.014

van den Akker, K., Schyns, G., & Jansen, A. (2017). Altered appetitive conditioning in overweight and obese women. Behaviour Research and Therapy, 99, 78–88. https://doi.org/10.1016/j.brat.2017.09.006

  

Reviewed by , 15 Feb 2023

This registered report submission outlines a study to answer the research question of whether verbal instruction may make extinction more efficient during appetitive (pain relief) than aversive (pain exacerbation) learning, additionally investigating associations of learning and extinction indices with pre-task resting-state fMRI connectivity between regions-of-interest and the rest of the brain. The majority of the submission is very clearly written and extremely thorough. In most aspects, it is an excellent registered report for a well-planned study. I am satisfied with the behavioral and psychophysiological section of the submission and would accept the submission almost as-is regarding the parts intended for Manuscript 1, with only minor points to address. However, I have some central criticisms pertaining to the research questions, theoretical background and hypotheses posed in the resting-state fMRI section intended for Manuscript 2, which I think should be resolved before acceptance.

 

1A. The scientific validity of the research question(s)

  • The scientific validity of the stated research questions for the non-MRI part of the submission intended for Manuscript 1 is good and I have no comments on the research questions themselves. There are only a couple minor things I would like to point out about the background provided:
    • Introduction, p. 4, paragraph on common neural systems for aversive and appetitive learning mechanisms: all literature used to support the claim “Together, these studies suggest a common neural system for appetitive and aversive learning mechanisms” used only a secondary (monetary) reinforcer and not a primary appetitive reinforcer such as the pain relief in this study plan. Especially with talk of “biological relevance”, I think this is a rather important distinction and should be a qualifier (and otherwise, what is this paragraph trying to say?). There is also a rather extensive animal literature on the neural basis and/or commonalities of aversive and appetitive learning, which is not cited at all.
  • The research question for resting-state fMRI component intended for Manuscript 2 is written as follows (p. 7, Introduction): “identify functional connectivity-based brain markers assessed with resting state fMRI acquired prior to task performance that are associated with an individual’s aversive and appetitive learning during acquisition and extinction, and the effect of the instruction”. This research question is not proposed well in the submission in relation to existing theory and how relevant the question may be for the field. 
    • Firstly, the theoretical background for the research question is not presented in a convincing manner. For example, p. 4 of Introduction, the sentence “Changes in resting-state functional connectivity of the amygdala after acquisition learning have been reported (Schultz et al., 2012)” does not specify what kind of changes were found. Why would it be relevant that there is any change in functional connectivity? Moreover, the sentence “Connectivity of the amygdala has been shown to be clinically relevant for the prediction of treatment outcome (Klumpp et al., 2014)” does not specify what kind of connectivity of the amygdala and with what other brain region/network, and what clinical condition and treatment were involved, and why this is relevant for the research question at hand. Finally, “Individual aversive acquisition learning (Kincses et al., 2023), and extinction learning (Belleau et al., 2018) were associated with brain connectivity, and connectivity changes” is extremely unspecific. Associated how, and to what indices of acquisition or extinction learning (the strength or speed of learning, or something else?), what brain connectivity, and what connectivity changes?
    • Secondly, it is obvious that any kind of learning is associated with changes in the brain so it would be important to try to devise studies that can precisely answer questions such as “associated how”, “what kind of changes”, “where exactly do the changes occur” and “are these changes actually important for the learning”. It is not clear from the submission how answering the above research question would advance our understanding of the main phenomenon under study, i.e. influence of instructions on extinction learning in the context of conditioned expectation of pain exacerbation and relief, and their neural mechanisms.

 

1B. The logic, rationale, and plausibility of the proposed hypotheses (where a submission proposes hypotheses)

  • The hypotheses for Manuscript 1 are specified well and linked to a sound theoretical background.
    • Minor point. In Table 1, last row, column “Theory that could be shown wrong by the outcomes” (p. 9-10), it is stated that “H4d: The interaction could show that instructed extinction can in fact affect CS valence ratings, as opposed to the interpretation by Luck and Lipp (2016), which would be in line with single-process accounts of fear learning (e.g., Brewer, 1974; Mitchell et al., 2009), which suggest a common basis for affective and expectancy learning”. Even if the outcome may contrast with the evidence from previous studies reviewed in Luck and Lipp (2016), I am confused as to why the authors think that CS valence being affected by instructions would support the single-process account (and therefore, provide evidence against the dual-process account) of fear learning. The arbitration between single- and dual-process accounts hinges on whether fear learning in humans relies on forming conscious expectations of aversive outcomes (measured via US expectancy ratings), or whether it can manifest in two independent learning processes, where in addition to conscious contingency learning there is lower-level learning (usually considered to be reflected in physiological CR) that may be outside conscious awareness and possibly not influenced by verbal instructions. Unless the authors argue that CS valence ratings can be taken as an example of the latter type of learning even if rating CS valence involves explicit reporting and therefore consciously accessing valence representations, which I would contend with, the outcome of this analysis does not in fact offer any unambiguous support for the single-process account. Of course, I may have misunderstood the argument but in that case, I would ask the authors to clarify it in the text.
  • Since the research question posed in the resting-state fMRI part of the submission is not well-specified and/or grounded in theory, the associated hypotheses are also vague. The hypotheses are written in the form: “significant associations of an effect of interest (e.g., acquisition index, effect of instructions on extinction efficacy) with the resting-state connectivity between seed ROIs (left and right dlPFC and vmPFC, amygdala, and striatum), and the rest of the voxels in the brain”.
    • Significant association as per which metric precisely, and in which direction? E.g., an entirely hypothetical example for a directional, more precise hypothesis: higher functional connectivity defined as Pearson's correlation (or other measure) between amygdala and vmPFC (or a specific resting-state network, e.g., the salience network) measured before the task is expected to be associated with higher extinction efficiency during the task.
    • If specific hypotheses are not justified by previous literature/existing theory, this part of the submission should be introduced as highly exploratory, and the final form of Manuscript 2 should also reflect this.
    • Additional minor point for H1+2 of Manuscript 2 (p. 6, from line 24): Each individual’s acquisition and extinction of CR is quantified only as the slope of US expectancy, excluding the other measures (e.g. SCR, pupil size). Since this presumably has to do with the intent of studying the impact of verbal instruction, it would be good to state clearly that US expectancy is used here to quantify the CR as it is likely the measure most influenced by verbal instruction.
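
To make the hypothetical directional hypothesis above concrete, the following is a minimal sketch of how such a connectivity metric and its association with extinction efficiency could be computed. It assumes one mean time series per ROI and one extinction-efficiency score per participant; the function names are illustrative only and do not describe the authors' pipeline or the CONN toolbox.

```python
import numpy as np

def pearson_connectivity(seed_ts, target_ts):
    """Pearson correlation between two ROI time series (one value per volume)."""
    seed = np.asarray(seed_ts, dtype=float)
    target = np.asarray(target_ts, dtype=float)
    return float(np.corrcoef(seed, target)[0, 1])

def fisher_z(r):
    """Fisher r-to-z transform, commonly applied before group-level statistics."""
    return float(np.arctanh(r))

def directional_association(connectivity_z, extinction_efficiency):
    """Across-participant correlation between connectivity and extinction efficiency.

    Under the hypothetical directional hypothesis, this value is predicted to be > 0.
    """
    return float(np.corrcoef(connectivity_z, extinction_efficiency)[0, 1])
```

Stating the metric (here Pearson's r, Fisher-transformed) and the predicted sign in advance is what turns "a significant association" into a testable directional hypothesis.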

 

1C. The soundness and feasibility of the methodology and analysis pipeline (including statistical power analysis or alternative sampling plans where applicable)

  • The methodology and analysis pipeline for the behavioral and psychophysiological analyses intended for Manuscript 1 seem sound and feasible.
    • Minor point for Table 1, first row (manipulation check US type), column “Interpretation given different outcomes”: “Relevant effect: A statistically significant main effect of the factor US type, indicating that ratings are higher for the USincrease than the USmedium, and higher ratings for the USmedium than the USdecrease …” – To be precise, a main effect does not in fact indicate that USincrease > USmedium > USdecrease. It only indicates that there is a significant mean difference overall between at least some of the levels of this factor. I am sure the authors know this as post-hoc comparisons are mentioned elsewhere but the statement here should be corrected. 
  • I cannot comment precisely on the details of the resting-state analyses intended for Manuscript 2 as I am not an expert in resting-state fMRI. The outlined preprocessing steps seem sound for fMRI data analysis in general. 
  • Since the sampling is based on a Bayes Factor plus maximal sample size as the stopping criterion, either the minimum sample size to be collected should also be defined (and be substantial enough to mitigate the issue of possible false positive evidence when reaching the evidence threshold after only very few participants, since “most misleading evidence happens at early terminations of a sequential design”) or a very high evidence threshold should be used (e.g. BF10 ≥ 30; see Schönbrodt & Wagenmakers, 2018, section “Sequential Bayes factor with maximal n: SBF+maxN”). Note that if the authors want to be able to claim evidence of absence, i.e. support for the null hypothesis, a Bayes Factor stopping criterion for H0 should also be set (as the sample sizes needed to reach strong enough evidence for H1 and H0 can differ). Moreover, the protocol for checking the data to determine whether the stopping criterion is fulfilled should be described in the submission.
    • Possible minor mistake in the sampling plan in Table 2, first row, p. 11: Is it correct that 75 participants per group are mentioned here, or is the full N = 150 used for these analyses without group separation?
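
The stopping logic requested above could be written down as explicitly as the sketch below, which makes the minimum N, maximum N, and both evidence thresholds part of the protocol. All numeric values are placeholders chosen for illustration, not the authors' planned values, and the function name is hypothetical.

```python
def sbf_decision(n, bf10, n_min, n_max, bf_h1=10.0, bf_h0=1 / 10):
    """Decide whether to stop a sequential Bayes factor design with maximal N.

    n      : participants collected so far
    bf10   : current Bayes factor in favour of H1 over H0
    n_min  : minimum N before any evidence-based stop is allowed
    n_max  : maximal N at which data collection ends regardless of evidence
    bf_h1  : stop for H1 when bf10 >= bf_h1 (placeholder threshold)
    bf_h0  : stop for H0 when bf10 <= bf_h0 (placeholder threshold)
    """
    if n < n_min:
        return "continue"      # too early: guards against misleading early stops
    if bf10 >= bf_h1:
        return "stop_h1"       # evidence threshold for H1 reached
    if bf10 <= bf_h0:
        return "stop_h0"       # evidence threshold for H0 reached
    if n >= n_max:
        return "stop_max_n"    # resources exhausted, evidence inconclusive
    return "continue"
```

For example, with n_min = 40 and n_max = 150 (placeholders), a Bayes factor of 50 after only 5 participants would still yield "continue", which is the safeguard against early misleading evidence.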

 

1D. Whether the clarity and degree of methodological detail is sufficient to closely replicate the proposed study procedures and analysis pipeline and to prevent undisclosed flexibility in the procedures and analyses

  • The methodological detail is largely excellent. The exceptions to this are:
    • Section 2.4.3.2. Acquisition training (p. 20, line 20): It would be good to refine the description here a bit more to unambiguously state that the participants received only instruction about the existence of contingencies but not of the actual contingencies themselves (and therefore, participants had to learn the contingencies during acquisition through experience), to avoid misunderstanding since this was an instructed conditioning study but (as far as I understood) the actual contingency instruction was only given for the extinction phase.
    • Section 2.5.2.2. Pupillometry data (p. 25, lines 10-11): “We will apply a correction to account for multiple comparisons”. What multiple comparisons correction will be used and to what tests will it be applied? This should be mentioned for all analyses.
    • The resting-state fMRI analysis section does not detail how the correlation maps are obtained with the CONN toolbox: “… first-level analyses in the CONN toolbox. We will use the rsfMRI scan to derive correlation maps between the respective seeds and all other brain voxels”. For those readers who are not experts in using the specific toolbox, it is not evident at all how the analysis is done, what options might be used, etc. Presumably, there are degrees of freedom in how these analyses can be conducted.
    • It is not clear why the authors chose the specific threshold for probabilistic threshold-free cluster enhancement z-score image false discovery rate (q < .02). 
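
To illustrate why the choice of correction and of the q threshold matters, here is a minimal sketch of the generic Benjamini-Hochberg false discovery rate procedure. It is not the probabilistic threshold-free cluster enhancement pipeline referred to in the submission, and the p-values in the usage note are invented for illustration.

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Return a boolean array marking which tests are rejected at FDR level q."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                          # indices that sort p ascending
    ranked = p[order]
    thresholds = q * np.arange(1, m + 1) / m       # BH critical values k/m * q
    below = ranked <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])           # largest rank meeting its threshold
        reject[order[:k + 1]] = True               # reject all tests up to that rank
    return reject
```

With the invented p-values [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216], two tests survive at q = .05 but only one at q = .02, which shows how strongly the reported threshold shapes the results and why it should be justified.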

 

1E. Whether the authors have considered sufficient outcome-neutral conditions (e.g. absence of floor or ceiling effects; positive controls; other quality checks) for ensuring that the obtained results are able to test the stated hypotheses or answer the stated research question(s).

  • Yes. Very minor point: I am sure the authors intend to do this as part of quality checks even if it was not mentioned, but in addition to checking the US type effect for US painfulness and US unpleasantness ratings, it should also be checked that there is a reliable US response for SCR and pupil size. The manipulation check would be USincrease > USmedium for the aversive side, while USdecrease > USmedium is possible in the appetitive case (in contrast to USmedium > USdecrease for the rating measures) due to the valence-independence of SCR and pupil size.