Are there oscillatory markers of pain intensity?

Based on reviews by Markus Ploner and Björn Horing
A recommendation of:

The effect of stimulus saliency on the modulation of pain-related ongoing neural oscillations: a Registered Report


Submission: posted 06 September 2023
Recommendation: posted 16 November 2023, validated 16 November 2023
Cite this recommendation as:
Dienes, Z. (2023) Are there oscillatory markers of pain intensity? Peer Community in Registered Reports.


Rhythmic changes in pain can lead to corresponding modulations of EEG amplitudes in the theta, alpha, and beta bands. But the question remains open as to whether these modulations actually track pain, or rather saliency or stimulus intensity. The question is of some importance because a marker of pain per se could be useful for tracking felt pain without a verbal response, and for investigating interventions for treating pain (such as suggestion). Here, Leu et al. (2023) will address whether the modulations reflect saliency or the intensity of pain by using an oddball paradigm in which most trials present a pain stimulus of a fixed intensity, with occasional oddball trials at either a higher or a lower intensity than the baseline trials. If the modulations reflect saliency, the modulation at the oddball frequency will be similar for high- and low-intensity oddballs. However, if the modulations reflect pain intensity, the modulation will be smaller in the low-oddball condition than in the high-oddball condition.
The Stage 1 manuscript was evaluated over three rounds of in-depth peer review, the first two consisting of substantial comments from two scholars with relevant expertise, and the third consisting of a close review by the recommender. Based on detailed responses to the reviewers' comments, the recommender judged that the manuscript met the Stage 1 criteria and therefore awarded in-principle acceptance (IPA).
URL to the preregistered Stage 1 protocol:
Level of bias control achieved: Level 6. No part of the data or evidence that will be used to answer the research question yet exists and no part will be generated until after IPA.
List of eligible PCI RR-friendly journals:
1. Leu, C., Forest, S., Legrain, V., & Liberati, G. (2023). The effect of stimulus saliency on the modulation of pain-related ongoing neural oscillations: a Registered Report. In principle acceptance of Version 4 by Peer Community in Registered Reports.
Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.

Evaluation round #3

DOI or URL of the report:

Version of the report: 3

Author's Reply, 15 Nov 2023

Decision by the recommender, posted 14 Nov 2023, validated 14 Nov 2023

Thank you for your revision. The exact analyses and conclusions still need to be clearer for row 1 of the design table. I believe you are interested in the simple effects of stimulus for each condition. The presence or absence of an interaction seems irrelevant to your conclusion, and a main effect of stimulus is not fully informative - right? Why not pre-register just two tests: the simple effect of stimulus (oddball vs baseline) for each condition (high vs low oddball)? If you do this, power should be calculated for those tests (even if it is not what one would ideally want, it should be known). If you fail to get evidence for either one, it is unclear whether you should expect modulation of oscillations in the corresponding condition. You say "we still COULD get modulations if there is a dissociation". Yes, you could. But you also have an out if you don't get modulations: the theory that there will be modulations is not weakened, because why SHOULD there be modulations if there was no difference in perception? So I would delete the prevarication. If you don't find both simple effects, the test is simply not a severe one of the theory that there will be modulations. By second test, do you mean second visit?
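As a rough illustration of what "knowing the power" for each simple effect could look like, the sketch below uses a normal approximation to a two-sided paired test. The function names, the standardized effect size `effect_size_dz`, and the default `alpha` are illustrative assumptions, not values taken from the manuscript; an exact calculation would use the noncentral t distribution and the authors' own smallest effect of interest.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def approx_power(effect_size_dz, n, alpha=0.05):
    """Approximate power of a two-sided paired test (normal approximation).

    effect_size_dz is a hypothetical standardized paired difference
    (mean difference divided by the sd of the differences).
    The negligible rejection probability in the opposite tail is ignored.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(abs(effect_size_dz) * sqrt(n) - z_crit)


def required_n(effect_size_dz, alpha=0.05, power=0.80):
    """Smallest n reaching the target power under the same approximation."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(((z_crit + z_power) / abs(effect_size_dz)) ** 2)
```

For example, with the 30 participants mentioned in the reviews and a purely hypothetical effect of 0.5, `approx_power(0.5, 30)` falls a little short of 0.80 under this approximation, and `required_n(0.5)` returns 32. Because the normal approximation is slightly optimistic compared with the t-test, these numbers should be read as a lower bound on the sample size, not as a substitute for the planned power analysis.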

Evaluation round #2

DOI or URL of the report:

Version of the report: 2

Author's Reply, 14 Nov 2023

Decision by the recommender, posted 13 Nov 2023, validated 13 Nov 2023

The reviewers are almost entirely happy with your extensive revisions; they make a few minor points still for you to consider. A further point: for row 1 of your table, spell out the inferential chain a bit more explicitly; e.g. first you will test the interaction; then if significant, you do two pairwise comparisons; then what follows if each of them is or is not significant? Make your reasoning process completely clear in advance.



Reviewed by , 05 Nov 2023

First of all, I thank the authors for having invested a lot of effort, which they hopefully felt as improving and not as complicating the proposal. Second of all, my remaining concerns are subordinate to several protocol constraints pointed out by the authors: 

For example, I understand that while it is thinkable to use different temperatures for the current [0 +3 -3] configuration, like [0 +3 +1.5], discriminability is counteracted by tolerability. The authors concur that the protocol might depend on the actually non-painful quality associated with the low oddball (or even _relief_, by not reaching the painful temperature implicitly predicted by the subject). I would only expect them to discuss this dimension when interpreting their data.

As another example, I feel like there is not a conceptual but an effective contradiction in the replies to my comments
- I.1 ("the baseline frequency may be reinforced..."), asserting that there is no differential influence on the baseline frequency by high/low oddball (in the sense of a direct effect of the oddball harmonics), and 
- II.2 ("Another section reads..."), acknowledging that "Given that the baseline stimuli could be perceived at different levels of painfulness given the condition, it is conceivable that this could have an influence on the modulation at the baseline frequency" (in the sense of an indirect, carry-over effect of the different temperatures)

This is not existential for the experiment as long as, again, any interpretation is tempered by possible alternative explanations.

In conclusion, I am satisfied with the proposal, and I am looking forward to the results of the study.

Reviewed by , 13 Nov 2023

On page 14, there is still an error in the definition of the beta band. Otherwise, my comments have been convincingly addressed. I wish the authors good luck with their study.

Evaluation round #1

DOI or URL of the report:

Version of the report: 1

Author's Reply, 31 Oct 2023

Decision by the recommender, posted 22 Oct 2023, validated 22 Oct 2023

Thank you for your well written paper. I have now received thorough comments from two expert reviewers. I will make further comments to do with the Registered Reports format in particular.

1) Make sure there is no analytic and inferential flexibility left; that is, anyone following your description would reach identical conclusions from your data.

For example:

p12 "the average amplitude of the signal measured at 2-5 neighboring frequencies to remove residual noise"
There is analytic flexibility here; decide precisely what you will do in advance.
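To make concrete what a fully pinned-down version of this step might look like, here is a minimal sketch of a neighbor-based noise subtraction. It assumes one particular reading of "2-5 neighboring frequencies" (the bins 2 to 5 steps away on each side of the target bin); the function name and the parameters `first_offset` and `last_offset` are hypothetical, and the authors' actual procedure may differ.

```python
def subtract_neighbor_noise(amplitudes, target_bin, first_offset=2, last_offset=5):
    """Subtract the mean amplitude of neighboring bins from the target bin.

    Assumed reading: neighbors are the bins first_offset..last_offset steps
    away on both sides of target_bin. Neighbors outside the spectrum are skipped.
    """
    offsets = range(first_offset, last_offset + 1)
    neighbors = [target_bin + sign * k for k in offsets for sign in (-1, 1)]
    in_range = [i for i in neighbors if 0 <= i < len(amplitudes)]
    if not in_range:
        raise ValueError("no neighboring bins inside the spectrum")
    noise = sum(amplitudes[i] for i in in_range) / len(in_range)
    return amplitudes[target_bin] - noise
```

Once the two offsets are fixed in advance, every analyst applying this function to the same spectrum obtains the same corrected amplitude, which is the property the comment above asks for.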

p13 "and the harmonic with the largest modulation (i.e., Wilcoxon signed-rank test statistic)"
What will you do if the test is non-significant? Why not just select the harmonic with the largest modulation?

In the Design Table, indicate exactly which test you are using to make inferences. In Row 1 you indicate a two factor model: Will the inference be based on just the interaction? On the interaction plus follow-up simple effects? Which simple effects? Here and elsewhere tie down your inferential chain *exactly*. Ask yourself for each row: Is someone else guaranteed to come to one single conclusion, the same as yourself, given the specifications here?

2) Power should be calculated with respect to the effect you do not wish to miss out on detecting, in order to control Type II error rates. If one calculates power with respect to an average past effect, it means you have not controlled for missing out on interesting effects smaller than this. For some guidance on thinking about this see:




Reviewed by , 08 Oct 2023

The proposal titled "The effect of stimulus saliency on the modulation of pain-related ongoing neural oscillations: a Registered Report" by Leu, Forest, Legrain and Liberati lays out an experimental protocol to disentangle various aspects of the processing of noxious stimuli. These include stimulus intensity processing, stimulus saliency, and ultimately the perception of pain. The protocol uses behavioral readouts (continuous ratings) and electroencephalography (EEG) using frequency-tagging. 

Previously, I have been reviewing the first iteration of the protocol; I now find that several of my concerns regarding the earlier draft have been adequately addressed. A number of issues remain, however (some old, some new). I have carefully worked through the manuscript and will list my comments individually or under each relevant stipulation listed at . 

Note that there is no perfect protocol and I believe the authors' approach has merit regardless of whether all concerns can be fully addressed. 

= = = = = = = = = = = 
_Individual major concerns_

The authors suggest an oddball paradigm where a baseline heat stimulation occurs at a higher frequency (sinusoidal stimulation at 0.5Hz). At a lower frequency (0.125Hz), oddball stimuli are interspersed in two variants (conditions), namely higher-than-baseline intensity, or lower-than-baseline intensity. Core hypotheses concern the relationship of the oddballs to each other, with different interpretations depending on whether the higher oddball is accompanied by higher EEG/ratings than the lower, or not.
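The timing implied by these two frequencies can be sketched as follows. The frequencies come from the description above; the oddball offset in °C, the trial duration passed in, and the position of the oddball within each group of cycles are assumptions made only for illustration.

```python
BASE_FREQ_HZ = 0.5       # baseline stimulation frequency (from the description above)
ODDBALL_FREQ_HZ = 0.125  # oddball frequency (from the description above)


def stimulus_sequence(duration_s, oddball_offset_c, oddball_position=3):
    """Return (cycle_start_s, temperature_offset_c) for each stimulation cycle.

    oddball_position (0-based index within each group of cycles) is an
    assumption; the source does not say which cycle carries the oddball.
    """
    cycles_per_oddball = round(BASE_FREQ_HZ / ODDBALL_FREQ_HZ)  # 4 cycles
    cycle_period_s = 1.0 / BASE_FREQ_HZ                         # 2 s
    n_cycles = int(duration_s * BASE_FREQ_HZ)
    sequence = []
    for cycle in range(n_cycles):
        is_oddball = cycle % cycles_per_oddball == oddball_position
        offset = oddball_offset_c if is_oddball else 0.0
        sequence.append((cycle * cycle_period_s, offset))
    return sequence
```

With a hypothetical 40 s trial, `stimulus_sequence(40, 3.0)` yields 20 cycles of 2 s each, of which every fourth (5 in total) carries the +3 °C offset; passing `-3.0` gives the lower-than-baseline variant. The sketch also makes visible that the baseline frequency is the fourth harmonic of the oddball frequency, which is relevant to the concerns that follow.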

I have two concerns. 

I. The first concern is hard to address but should at least be discussed as limitation, or maybe the authors have an idea to actually solve it (e.g. two different suprathreshold/high oddballs): The high oddball _encompasses_ the baseline stimulation (i.e. reaches baseline intensity, then goes above and beyond it; baseline+), whereas the low oddball does not (i.e. is not even baseline-). This could mean (among other possibilities) that
- the baseline frequency may be reinforced/more sustained in the high oddball (or vice versa disrupted in the low oddball), which might overestimate low oddball contributions where baseline/oddball contrasts are involved (but see concern II)
- there may be a qualitative difference (the most obvious concern being pain threshold-related, cf Supplementary Figure 2 where VAS dips below pain threshold) between the oddballs that would then make an attribution to simple intensity differences harder.

II. The second concern is: Why is an oddball paradigm needed at all for this comparison (it is mentioned only in an adjunct hypothesis of the Hypothesis Table in Supplementary Materials)? In other words, what function does the baseline frequency have/what does it add in terms of elaborating the differences between high oddball and low oddball (EEG power, or rating)? Removing the baseline frequency altogether, the oddballs would then become mere stimuli, without any loss of internal consistency. There are several possible directions to explain why the oddball might be required (instead of a simple frequency-tagged intensity comparison, which would of course have its own weaknesses, but I am not sure how the oddball overcomes them), which should be explicitly mentioned to motivate the oddball paradigm.
- Hypothesis Table, Hypotheses 1 and 3a state that the baseline serves _as a positive control_ - but wouldn't just looking at the oddball frequency (or its absence) alone also allow the same conclusion whether frequency-tagging worked?
- Another section reads "As for the phase-locked response, the difference between baseline and oddball will be calculated for each condition and frequency band. Then, for each frequency band, a paired t-test will be employed to compare the peaks related to .high and .low. If the intensity of the stimulus is the main factor in the modulation of ongoing oscillations, the amplitude for .high will be larger than the amplitude for .low. If saliency is more relevant than intensity, the amplitudes of the oddball in the normal and the control condition will be similar to each other." This seems to suggest that the authors expect baseline to be different between the conditions (if it was identical, it would not need to be considered), but they do not explain why (cf. the following point).
- Finally, using the oddball paradigm may alleviate some issues I brought up at concern I, but this has not been pointed out clearly/is at best implicit.

In conclusion, it is unclear to me how the baseline aspect (and vice versa, using an oddball per se) is motivated. There may be good reasons, but they are not stated clearly enough. Does sensory entrainment with an oddball paradigm make the oddball frequency (& harmonics) somehow more robust, or the comparison more sensitive, or anything like that? If so, this crucial aspect should be mentioned as a rationale for using an oddball paradigm to begin with. Or does the oddball modulate the baseline frequencies, and comparing their modulation by high versus low is the actual question here? If so, this aspect may have been omitted.

= = = = = = = = = = =  
_Individual minor suggestions_

- I suggest considering time as a predictor to account for the inevitable decrease in attention/vigilance in a not very eventful protocol (during EEG it's simply passive perception over at least 24*80=1920s=32min, albeit with pauses)
- Rating-peaks (i.e. maxima of continuous ratings) can probably be considered more generously than the proposed 1.35 to 2s (for example, from 1 to 3s after stimulus-peak), because the subsequent stimulus-peak's maximum should only be reached at the earliest after 2+1.35=3.35s (if I understand correctly); that said, not only Mulders 2020 should be considered but also their own pilot data, where rating-peaks were reached ~1s (not 1.35s) after stimulus-peaks.
- Is no fixation cross (or some gaze target beyond "keeping it steady") employed?
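
To illustrate how the choice of rating window in the second suggestion can change the extracted rating-peak, here is a minimal sketch. The function name, the sampled-ratings representation, and the default window of 1 to 3 s are assumptions for illustration; the 1.35 to 2 s window is the one quoted above.

```python
def rating_peak(times_s, ratings, stimulus_peak_s, window_start_s=1.0, window_end_s=3.0):
    """Return (latency_s, rating) of the maximum rating inside the window.

    The window is measured relative to stimulus_peak_s. If several samples
    share the maximum, the earliest is returned. Returns None when no rating
    sample falls inside the window.
    """
    candidates = [
        (t - stimulus_peak_s, r)
        for t, r in zip(times_s, ratings)
        if window_start_s <= t - stimulus_peak_s <= window_end_s
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[1])


# Hypothetical ratings sampled every 0.5 s, with the true maximum 2.5 s after the stimulus peak.
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ratings = [0, 1, 2, 3, 4, 6, 4, 2, 1]
wide = rating_peak(times, ratings, stimulus_peak_s=0.0)
narrow = rating_peak(times, ratings, 0.0, window_start_s=1.35, window_end_s=2.0)
```

In this made-up example the wider window recovers the true maximum, whereas the 1.35 to 2 s window stops at the last sample it contains and reports a lower value.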
= = = = = = = = = = =  
_Does the research question make sense in light of the theory or applications? Is it clearly defined? Where the proposal includes hypotheses, are the hypotheses capable of answering the research question?_
Yes. As for "clearly defined", I would call out a few instances that could be clearer, conceptually or linguistically.

Generally consider checking for language, e.g. confusing sentences like "Painful stimuli emerge from the activity of the nociceptive system which is made to respond to high-intensity and potentially damaging somatosensory stimuli and are therefore inherently salient and to facilitate involuntarily capture of attention (Eccleston & Crombez, 1999)."

Please use consistent/recurrent/recognizable nomenclature and concepts. For example, 
- sometimes you say "salience or intensity", sometimes "saliency or painfulness", sometimes all three, when you basically refer to the same issue (i.e. distinguishing these facets)
- sometimes you say "normal" and "control" oddballs or conditions, sometimes "high" and "low"
- sometimes you write "base", sometimes "baseline" (I think as FoS-Tag)
- "Based on the assumption that the high oddball will be more salient than the baseline stimuli, we expect that they will be perceived as more intense than the baseline stimuli." is begging the question of the relationship between saliency and intensity (or intensity perception, or pain? unclear)

= = = = = = = = = = = 
_Is the protocol sufficiently detailed to enable replication by an expert in the field, and to close off sources of undisclosed procedural or analytic flexibility?_

Yes, mostly. The experimental procedure could be described more completely, for example, "For each condition (i.e., normal / control), 12 trials will be delivered distributed over 6 blocks of 4 thermonociceptive stimulation trials (Figure 3).": This suggests that breaks between _blocks_ (i.e. quadruplets of counterbalanced High|High|Low|Low oddballs) are self-paced between 2 and 5 minutes, but what about breaks between _trials_ (ITI between the High=>High, for example)? Also please indicate an overall duration in 2.5 not just in 2.1, which should be around 30min for EEG setup, plus 24*80s=1920s=32min for stimulation, plus between 5*2=10mins and 5*5=25mins for inter-block-breaks, plus 6*3*X=Xmin unspecified inter-trial-intervals.
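
The duration estimate above can be written out as a small calculation. The numbers are the ones given in the paragraph; the inter-trial interval `iti_s` is the unspecified quantity X and is therefore left as a parameter, and the function name is hypothetical.

```python
def session_minutes(iti_s, break_min, n_blocks=6, trials_per_block=4,
                    trial_s=80, setup_min=30):
    """Estimate the total session duration in minutes.

    iti_s: inter-trial interval in seconds (unspecified in the protocol).
    break_min: self-paced break between blocks, in minutes (2 to 5).
    """
    n_trials = n_blocks * trials_per_block            # 24 trials
    stimulation_min = n_trials * trial_s / 60         # 1920 s = 32 min
    breaks_min = (n_blocks - 1) * break_min           # 5 breaks between 6 blocks
    n_intervals = n_blocks * (trials_per_block - 1)   # 6 * 3 = 18 intervals
    iti_min = n_intervals * iti_s / 60
    return setup_min + stimulation_min + breaks_min + iti_min
```

Ignoring inter-trial intervals, this gives 72 minutes with 2-minute breaks and 87 minutes with 5-minute breaks; a hypothetical interval of 60 s would add another 18 minutes.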

I am having some issues following the calibration procedure. In the manuscript proper, the authors write that subjects should experience pain "throughout the entirety of each trial"/"throughout the 40s trial", elsewhere they write that subjects should "overall" (which I read as "generally but not always") experience pain during the trial (legend Figure 3). The latter makes sense considering that the sinusoids always start at 35°C, so _none_ of the trials will ever be painful "throughout their entirety" (even if sensitization sets in immediately after peak 1 and all 35° troughs are experienced as painful - which is also unrealistic -, the first ramp will not be). Do the authors mean to say that "Participants will be asked whether they perceived the stimulation AT THE PEAKS as painful throughout the 40s trial"? Operationalization should be crystal clear here. Relatedly, how are subjects instructed what "painful" means, and is the intention to achieve a certain degree of pain (e.g. VAS8 at peaks), or just painfulness (i.e. at least pain threshold/VAS5 at peaks)?

I do not understand what is meant by "change in perception of the painfulness twice in a row"? That in two consecutive 40s trials, subjects afterwards report an increase in the 3-category ratings system offered ("no pain overall", "only painful in first half of trial", or "painful throughout the trial")? What if category 2 is never chosen, or the second half of the trial is painful?

= = = = = = = = = = = 
_Is there an exact mapping between the theory, hypotheses, sampling plan (e.g. power analysis, where applicable), preregistered statistical tests, and possible interpretations given different outcomes?_


= = = = = = = = = = = 
_For proposals that test hypotheses, have the authors explained precisely which outcomes will confirm or disconfirm their predictions?_
Yes, mostly. I was wondering if 4b and 6b are meant to be positive controls, as well - if frequency-tagging works with the high oddball but not the low, surely there is nothing to compare?

= = = = = = = = = = = 
_Is the sample size sufficient to provide informative results?_

= = = = = = = = = = = 
_Where the proposal involves statistical hypothesis testing, does the sampling plan for each hypothesis propose a realistic and well justified estimate of the effect size?_


= = = = = = = = = = = 
_Have the authors avoided the common pitfall of relying on conventional null hypothesis significance testing to conclude evidence of absence from null results? Where the authors intend to interpret a negative result as evidence that an effect is absent, have authors proposed an inferential method that is capable of drawing such a conclusion, such as Bayesian hypothesis testing or frequentist equivalence testing?_
Yes, well, mostly. I am having some issues with the way some interpretations are phrased concerning whether hypotheses are rejected or not. For example, "A similar amplitude of the oddball in the high and low oddball condition would show that the oddball response is mainly driven by the saliency of the stimulus. If the oddball in the low oddball condition would lead to a smaller response compared to the oddball in the high oddball condition it would indicate that the oddball the intensity of the stimulus is responsible for the peak related to the oddball." is quite assertive as to the mechanisms involved. I would prefer more epistemologically cautious phrases like "supports a role of" or "suggests that XYZ is not reflected in the signal" or so. Saliency and intensity are not the only thinkable contributors to either readout of this study.

= = = = = = = = = = = 
_Have the authors minimised all discussion of post hoc exploratory analyses, apart from those that must be explained to justify specific design features? Maintaining this clear distinction at Stage 1 can prevent exploratory analyses at Stage 2 being inadvertently presented as pre-planned._


= = = = = = = = = = = 
_Have the authors clearly distinguished work that has already been done (e.g. preliminary studies and data analyses) from work yet to be done?_
Yes. (as a minor aside, I was wondering, why do the ratings in Suppl Fig 1 go up again at the end of the trial?)

= = = = = = = = = = = 
_Have the authors prespecified positive controls, manipulation checks or other data quality checks? If not, have they justified why such tests are either infeasible or unnecessary? Is the design sufficiently well controlled in all other respects?_

Yes, they have specified positive controls.

= = = = = = = = = = = 
_When proposing positive controls or other data quality checks that rely on inferential testing, have the authors included a statistical sampling plan that is sufficient in terms of statistical power or evidential strength?_


= = = = = = = = = = = 
_Does the proposed research fall within established ethical norms for its field? Regardless of whether the study has received ethical approval, have the authors adequately considered any ethical risks of the research?_

That seems to be for the ethics committee to decide, who have given a positive vote. However, for me, some concerns remain regarding stimulus intensities. These are possibly device-related (non-)issues, but important enough to emphasize: The temperatures mentioned are very high. If tissue actually reached such temperatures, burns would be inevitable in the timeframes employed, especially considering the flat addition of 3°C (safe exposure duration decays exponentially by intensity, cf. Moritz & Henriques, Am J Pathol 1947); actual temperatures of 50°C would be excruciating for 90+% of people. The authors have piloted the procedure and it seems to work, but I would suggest they figure out the discrepancy to established temperature ranges.

Reviewed by , 09 Oct 2023

The proposed study aims to investigate the effects of stimulus intensity and saliency on modulations of pain-related neural oscillations. To this end, the authors propose a paradigm in which sustained periodic heat stimuli are applied to 30 healthy human participants. In an oddball paradigm, deviant stimuli of higher or lower intensity will be interspersed. Differences in amplitudes of oscillations between high and low oddball stimuli would indicate that oscillations are influenced by stimulus intensity. In contrast, a lack of a difference would be taken as evidence for an effect of saliency on oscillations. 
The study is well-planned, and the manuscript is mostly clear and convincing. However, the framework, the paradigm, and the analysis might benefit from modifications and added details.
1. Framework. The framework and the analysis propose a binary view of stimulus intensity and saliency effects on pain-related brain activity. Brain activity is either influenced by stimulus intensity or saliency. However, considering previous evidence for saliency and stimulus intensity effects on pain-related brain activity, a mixture of both effects appears possible, if not likely. So far, a difference between high and low oddball stimuli will be interpreted as a stimulus intensity effect but does not provide any information about possible additional saliency effects. In contrast, a lack of a difference between high and low oddball stimuli will be interpreted as an exclusive saliency effect. Thus, in the likely case of a difference between high and low oddball stimuli, the study's outcome would be evidence for a stimulus intensity effect but essentially no information about saliency effects. The authors might fundamentally re-consider their framework and analysis so that it can account for and quantify non-binary effects of saliency and stimulus intensity.
2. Paradigm. The paradigm includes a sustained periodic heat stimulation with high and low oddball stimuli. However, high and low oddball stimuli do not only differ concerning stimulus intensity but also concerning painfulness. The high oddball stimulus is always painful, whereas the low oddball stimulus is always non-painful (Supplementary Figures 1 and 2). Thus, the quantitative difference in saliency between both conditions is confounded by a qualitative difference between painful and non-painful stimuli. This should be carefully considered and accounted for. The authors might consider adjusting the paradigm so that all stimuli are painful.
3. Writing. The general reader might need to become more familiar with the frequency tagging approach and the underlying logic. A more straightforward explanation of the approach with an intuitive figure showing what will be analyzed and termed “phase-locked response” and what will be called “modulation of ongoing oscillations” would be helpful.
4. Analysis of behavioral effects. The analysis window for pain ratings is 2 sec after the stimulation peaks. However, the pilot experiment has revealed that pain ratings in the low oddball condition peaked after 2.33 sec. Thus, an adjustment of the analysis window might be appropriate.
5. Analysis. The definition of the frequency bands does not adhere to the COBIDAS recommendations (Pernet et al., Nat Neurosci, 2020). This should be corrected.
6. P.15, second paragraph. In the second last line, “high oddball condition” might have been confounded with “low oddball condition.”
