
Are dreams important for memory consolidation?

Based on reviews by 1 anonymous reviewer
A recommendation of:

The relationship of memory consolidation with task incorporations into dreams – A registered report


Submission: posted 23 March 2022
Recommendation: posted 23 January 2023, validated 23 January 2023
Cite this recommendation as:
Chambers, C. (2023) Are dreams important for memory consolidation?. Peer Community in Registered Reports, .


Sleep is known to be crucial for human memory, but what about dreams? Previous research has shown that the content of dreams can be manipulated by specific stimuli or tasks prior to sleep, but whether incorporating tasks into dreams influences memory consolidation is less clear. Some studies have shown an association between incorporating memory tasks into dreams and later memory performance, while others show either no effect or weaker effects. Potential reasons for this variation include the targeting of different stages of sleep – including rapid eye movement (REM) and non-REM (NREM) stages – small sample sizes, and the fact that many previous studies do not employ declarative memory tasks, which have been found to benefit more from sleep compared with tasks that target procedural memory.
In the current study, Schoch et al. (2023) ask whether dreams are an epiphenomenon of sleep-dependent memory processing or, instead, whether they play a key role in memory consolidation – and if so, whether that role differs for subjective experiences during NREM and REM sleep stages. Using a declarative memory task, a serial awakening paradigm (in which participants are woken and tested during NREM or REM stages), and targeted memory reactivation (TMR), the authors will test two main hypotheses: first, that incorporating picture categories of a declarative memory task into dreams leads to immediate (next morning) and sustained (4 days later) improvement in memory performance (especially for NREM dreams); and second, that TMR influences the reported content of dreams. The authors also build in a range of control analyses to confirm that the task was incorporated successfully into dreams and that TMR benefited memory performance.
The Stage 1 manuscript was evaluated over two rounds of in-depth review, initially at Nature Communications before being transferred to PCI RR for further evaluation (see review history below for details). Based on detailed responses to the reviewers' comments, the recommender judged that the manuscript met the Stage 1 criteria and therefore awarded in-principle acceptance (IPA).
URL to the preregistered Stage 1 protocol:
Level of bias control achieved: Level 6. No part of the data or evidence that will be used to answer the research question yet exists and no part will be generated until after IPA.
List of eligible PCI RR-friendly journals:
1. Schoch, S. F., Ataei, S., Salvesen, L., Schredl, M., Windt, J., Bernardi, G., Rasch, B., Axmacher, N., & Dresler, M. (2023). The relationship of memory consolidation with task incorporations into dreams – A registered report, in principle acceptance of Version 3 by Peer Community in Registered Reports.
Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.


Evaluation round #2

DOI or URL of the report:

Version of the report: v2

Author's Reply, 20 Jan 2023

Decision posted 03 Jan 2023, validated 04 Jan 2023

Thank you for the extremely thorough revision, which addresses all of the points raised in my previous decision letter. One of the original reviewers notes a couple of minor points to address in a final Stage 1 revision and response. We should then be in a position to award in-principle acceptance without further in-depth Stage 1 review.

Reviewed by anonymous reviewer 1, 02 Dec 2022

The authors have done an excellent job with this revision. I only have two comments / suggestions:
Lines 135-6: “Hypothesis 2) TMR leads to the subsequent incorporation of the associated image categories into dreams during both NREM and REM sleep stages.” Is there a memory prediction here? Will TMR-induced dream incorporation benefit memory in both NREM and REM? This is slightly different than the control analysis, in that dream incorporation may actually predict memory consolidation. If the authors do not have a strong hypothesis about whether dream incorporation in one or both stages would affect memory, they could also mention this as another planned analysis later (apologies if I missed this if they did).

Regarding waking subjects up 10-30 s after TMR, I imagine this range is to allow for some meaningful variability? If not, could the authors please explain this rationale? Also, I suggest that the authors log this information and include it as a predictor in their H2 multilevel generalized model. It could have a strong effect on dream incorporation, which would be instructive for later dream induction efforts.

Looking forward to seeing the results, and I would be happy to review again at that time.

Evaluation round #1

DOI or URL of the report:

Author's Reply, 30 Nov 2022

Decision posted 24 Mar 2022

Thank you for submitting your Stage 1 manuscript as part of an agreed transfer from Nature Communications. As you know, Nature Communications and the three anonymous reviewers who evaluated your initial submission have consented to this transfer following the editor's decision to reject your manuscript. From this point forward, PCI RR will assume control of the review process.

As you will see, the reviews are broadly positive about your submission but note a number of areas requiring revision in order to meet the PCI RR Stage 1 criteria. The most substantial areas of concern are justification of a range of study design characteristics as well as the need for additional detail and clarification regarding the analysis plans.

Based on my own reading, I also have a number of comments that I would like you to consider in revision. I include these below under the section "Recommender Comments".

Please respond comprehensively to all of the comments below in a revised manuscript and a point-by-point response to me and the reviewers. I will then return the manuscript to the three reviewers for re-evaluation.


Recommender Comments

1. For a RR, it is crucial to ensure an exact alignment between the power analysis and statistical methods that will test the hypotheses. Power analysis methods are available for LMEs (e.g. here) or should be reported using simulations. It was unclear to me what approach you had taken with the power analysis and which model(s) it applied to within the LMEs. The power analysis must be calibrated to be sufficient to meet PCI RR requirements for the most conservative hypothesis test among the set.

2. Rather than sampling 15% over the required sample size, commit to collecting the minimum required sample size to achieve the desired power, regardless of the exclusion rate. This will provide certainty concerning the sample size included in the analysis.

3. "Outliers will be inspected, but not removed unless there is a reason to believe they are due to measurement error. However, to ensure robustness of the results, models will be repeated with outliers (> 3 SD) removed." I found this a little vague. At what level of granularity (i.e. what cells in the design) will this rule of >3 SDs be calculated and applied? Will statistical outcomes with or without outliers determine the conclusions? How will the conclusions be adjusted if outlier removal alters support for the hypotheses? These contingencies need to be defined precisely to eliminate risk of bias.

4. "If the data distribution of the residuals is non-normal, we will examine if a gamma distribution is a better fit. If problems persist, data will be transformed with a logtransform. Missing data will be estimated using full maximum likelihood." Define the precise conditions for concluding non-normality and the method used to determine it. Define precise contingencies under which different transformations will or will not be applied and corresponding downstream effects on analysis plans (if any).

5. “…due to bad sleep quality or problems with data recording.” Provide an objective set of criteria for determining that data is “bad quality”.

6. A design table is included but please adapt it to the PCI RR version, which has additional requirements (see here). I struggled to follow the structure and logic of the Control analysis section within the design table. Ensure that there is a one-to-one correspondence between statistical tests and interpretations for all hypotheses (including the Control analysis section). If you predict certain relationships for the control analyses, these should be included in the hypothesis column for that row. Perhaps insert a Variables column into the design table to keep the Analysis Plan column focused on the model(s) that will be tested. Alternatively, rather than adding a Variables column, move the description of variables out of the design table and into the Method section in the main text.

7. Hypothesis 1 is multi-pronged (with multiple sub-hypotheses) and the testing of Hypothesis 3 is contingent on support for Hypothesis 1. This makes it especially crucial to define the precise conditions under which Hypothesis 1 will be considered to be confirmed or disconfirmed based on combinations of outcomes for each of its sub-hypotheses. This needs to be made crystal clear.

8. The design table splits the hypotheses into sub-hypotheses (which is good) so please do the same in the introduction. I suggest listing them at the end of the Introduction in bullet point form to achieve maximum clarity.

9. Use of Bayes factors needs to be substantially elaborated, making clear the priors and any other relevant parameters, and the precise conditions under which Bayes factors will be reported. Bayes factors are mentioned in the study design table but, unless I missed it, are not mentioned anywhere else in the manuscript.

10. The Statistical Analysis section needs to be greatly expanded to provide comprehensive detail, rather than relying entirely on the design table (this is also noted by the reviewers).

11. The exclusion criteria are quite complex so please add a CONSORT-style diagram to illustrate the rules under which data / participants will be excluded at different stages in the data acquisition and analysis.
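The simulation route mentioned in Comment 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: it simulates a single two-level within-participant factor (category incorporated vs. not incorporated) with a random intercept per participant, and tests it with a paired t-test on per-participant differences, which stands in for the fixed-effect test of a random-intercept model in this balanced, one-observation-per-cell case. The function name, the effect size, the standard deviations, and the use of a normal critical value (slightly liberal at moderate sample sizes) are all hypothetical choices. Only the sample size of 108 comes from the manuscript as quoted in the reviews.

```python
import random
import statistics
from statistics import NormalDist


def simulate_power(n_participants, effect, sd_subject=1.0, sd_noise=1.0,
                   alpha=0.05, n_sims=2000, seed=1):
    """Estimate power as the share of simulated studies that reject H0.

    All parameter values are illustrative assumptions, not values
    taken from the Stage 1 protocol.
    """
    rng = random.Random(seed)
    # Two-sided critical value; normal approximation to the t distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        diffs = []
        for _ in range(n_participants):
            intercept = rng.gauss(0.0, sd_subject)  # random intercept
            not_incorporated = intercept + rng.gauss(0.0, sd_noise)
            incorporated = intercept + effect + rng.gauss(0.0, sd_noise)
            diffs.append(incorporated - not_incorporated)
        mean_diff = statistics.fmean(diffs)
        std_err = statistics.stdev(diffs) / n_participants ** 0.5
        if abs(mean_diff / std_err) > z_crit:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    # Hypothetical effect of 0.4 noise SDs with 108 participants.
    print(simulate_power(108, effect=0.4))
```

For the full design, the same loop would fit the planned LME to each simulated dataset, and the reported power would be the one obtained for the most conservative hypothesis test in the set.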


Reviewer #1

This manuscript presents the authors' proposal for conducting a study that specifically examines the extent to which task information is incorporated into dreams and how that relates to subsequent memory performance. This is an interesting question, and previous studies investigating this question have uncovered conflicting evidence. This is also a difficult question to address since the contents of dreams are difficult to probe.

In this design, the authors propose to have subjects perform a simple visual association task prior to sleep. Then during sleep, at four points during NREM and also four points during REM, the experimenters will wake the subject and ask them to report their dreams. This will address the question of whether task information is incorporated into dreams, and subsequently how much this correlates with memory performance. In a second version of this paradigm, the experimenters will introduce an audio recording of words presented in the task during sleep prior to waking (targeted memory reactivation) in order to assess whether this manipulation will increase the likelihood of incorporating task information in the dreams, and the subsequent effects on memory. An additional question to be pursued here is whether the incorporation of task elements during REM sleep changes the emotional valence of those images.

This is a well-designed study and an interesting question, and it would potentially provide a valuable result to the community of researchers studying the relation between dreams and memory. This is also, however, a challenging study, largely because of the challenges inherent in probing the contents of dreams. It relies entirely upon the subjects' ability to recall and report details of their dreams. The task design used here (training on reporting prior to the experiment and the experimental paradigm itself) partly addresses those limitations, and so there is little more that can be done about this. One strength of the design is the relatively large number of participants expected to be enrolled. This could help mitigate some of the concerns regarding the data that are being captured.

Another issue is that the images and words used in the memory task are drawn from a large dataset and are likely very common, which makes it unclear whether the contents of a dream are explicitly related to the task, or just coincidentally overlap with some elements of the task (for example, if the task contains a picture of a dog, and you dream of a dog, is it because of the task, or is it because you happen to be dreaming about a dog). This could be partly addressed by using only very unique and rare images or items for the memory task.


Reviewer #2

The authors propose a study investigating the relationship between dream content, memory retention, and emotional processing. They do a nice job of covering the literature (though see a few notes below), the planned data collection seems sufficient to address their questions, and the planned analyses are thorough and clear. The proposed study is interesting and does not have any obvious flaws (provided some of the answers to the below concerns are reasonable), though I do question whether the study’s importance, even if all hypothesized results are obtained, warrants publication in Nature Communications. That said, I will leave that up to the editor to decide and I offer my comments/concerns below.


Do the authors have a hypothesis regarding whether TMR during REM sleep (or TMR in general, since it will be applied during both SWS and REM on the same night) will further lower emotional and arousal ratings beyond ratings given on the spontaneous night? It seems reasonable to at least consider this for a planned post-hoc test, if there is no hypothesis.

Are there predictions regarding the follow-up memory recall test?

Are there any physiological predictions, such as correlations between stages or hallmarks of sleep like slow oscillations and spindles?


2nd paragraph: Oudiette et al. (2011) is a relevant study here. 3rd paragraph: “less” -> “fewer”

Final paragraph: Konkoly et al. (2021) is highly relevant here, as is Horowitz et al. (2020).


Please define EGG for the readers in the main text.

“In experimental session A, participants will be woken up a maximum of four times from NREM and four times from REM sleep, 15 minutes into each sleep stage. A free dream report for the last minute of sleep will be elicited during each awakening, followed by ratings on several scales.” Why 15 minutes? What happens if this stage is broken up by another sleep stage (e.g., 8 minutes of REM, 2 minutes of stage-2, and then more REM)? I would advise the authors to lower this number unless they have a strong justification for it, considering the frustrations that will likely ensue if they enforce it strictly.
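The fragmentation scenario raised here (8 minutes of REM, 2 minutes of stage 2, then more REM) can be made concrete with a small sketch. It is only an illustration of how an awakening rule could be written down unambiguously, not the authors' procedure: the function name, the 30-second epoch length, the stage labels, and the idea of a tolerated interruption (`max_gap_min`) are all assumptions. In this version, only epochs of the target stage count toward the 15 minutes, and the count resets once an interruption exceeds the tolerance.

```python
def first_awakening_epoch(hypnogram, target, required_min=15.0,
                          max_gap_min=0.0, epoch_s=30):
    """Return the index of the epoch at which the awakening criterion
    is first met, or None if it is never met.

    hypnogram: sequence of stage labels, one per epoch (hypothetical format).
    max_gap_min: longest interruption by other stages that is tolerated
    without restarting the count (0 means strictly continuous sleep).
    """
    required = round(required_min * 60 / epoch_s)
    max_gap = round(max_gap_min * 60 / epoch_s)
    accumulated = 0  # target-stage epochs counted in the current episode
    gap = 0          # consecutive non-target epochs since the last target epoch
    for index, stage in enumerate(hypnogram):
        if stage == target:
            accumulated += 1
            gap = 0
            if accumulated >= required:
                return index
        else:
            gap += 1
            if gap > max_gap:
                accumulated = 0  # interruption too long: start over
    return None


if __name__ == "__main__":
    # 8 min REM, 2 min N2, 10 min REM, in 30-second epochs.
    night = ["REM"] * 16 + ["N2"] * 4 + ["REM"] * 20
    print(first_awakening_epoch(night, "REM"))                    # strict rule
    print(first_awakening_epoch(night, "REM", max_gap_min=2.0))   # tolerant rule
```

Under the strict rule this example night never yields a REM awakening, whereas tolerating a 2-minute interruption does, which is the kind of difference a preregistered rule would need to pin down.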

“The words will be presented for approximately 10 minutes before each awakening.” How long will the authors wait between the final word presentation and awakening? This is a critical detail and could very well influence their results. Additionally, do the authors have a hypothesis regarding whether dreams will be more likely for more recent (vs. less recent) associated memories?

“maixmally” -> “maximally”

“108 healthy male” -> “One hundred and eight…” Please do not start a sentence with a number.

“When the participant is lying in bed, we will do a resting-state EEG measurement (1.5 min eyes open, 1.5 min eyes closed, 1.5 min eyes open, 1.5 min eyes closed).” Can the authors please explain the rationale for this? Is there a hypothesis linked to this measurement?

“In both experimental nights, participants will be instructed to signal if they have a period of lucidity. Dream reports with lucidity will be removed from analyses (score >= 4 on the lucidity scale).” Can the authors also explain the rationale for this exclusion practice?


What will the authors do if control analyses do not go as planned, e.g., the H1a model is supported but dream length is significant?

Why these interactions? “NREM_inc_cor:Timepoint +REM_inc_cor:Timepoint” Aren’t the dream incorporations (and all related sleep variables) going to be the same, as attributed to both evening and morning? Please clarify if I’ve simply misunderstood, as it will perhaps be unclear to other readers too.

What is different in the two main H1b, “Check if decrease across time is significantly dependent on valence” sections?


Reviewer #3

The authors propose a study to assess the role of dreaming for memory consolidation. This issue is of great significance for the cognitive neuroscience community. Despite numerous studies on the interplay of sleep and memory, this important research question has still not been answered satisfactorily.

Overall, the proposed study design is suitable to investigate if memory consolidation actually benefits from dreaming. The introduction provides a comprehensive overview of the relevant literature and the proposed hypotheses are plausible. Furthermore, the proposed methodology and analysis pipelines are sound and feasible. The authors provide a careful statistical power analysis based on conservative assumptions, finally leading to a relatively large number of 108 participants required for the study.

The methods are described very clearly. Besides my comments below, which need to be addressed prior to the study, the authors provide all necessary detail to prevent undisclosed flexibility in, and to enable exact replication of, the proposed experimental procedures and analysis pipelines.

Comments regarding the proposed statistical analysis:
Compared to the degree of detail of the study design description, the analysis paragraph is rather short and does not contain enough detail.

For instance, the statement “If the data distribution of the residuals is non-normal, we will examine if a gamma distribution is a better fit. If problems persist, data will be transformed with a logtransform” needs more elaboration. Why is the gamma distribution the second best choice? Which parameters are expected to be normal, gamma, or lognormal distributed? How will the authors test for several distributions?

Also the statement “Missing data will be estimated using full maximum likelihood.” needs further explanation. Why does missing data need to be estimated at all, instead of just being skipped? How exactly will missing data be estimated in an unbiased way and how is it assured that the estimated data does not affect the overall results? In order to prevent undisclosed flexibility in, and to enable exact replication of the analysis pipeline, this issue needs to be addressed.

Comments regarding the (supplemental) methods section. The following issues should be addressed prior to the proposed study:

The authors state that “The adaptation night is scheduled as closely as possible to the first experimental night (the night before the first experimental night, maximally seven nights before)”.
However, I strongly recommend the adaptation night to be always immediately before the first experimental night.

Regarding this procedure: “three trials are conducted where words are presented at increasing sound levels (from 20 dB in 5 dB steps) until the participant shows an arousal.”
- It should be “20 dB SPL”
- How is the speech transduced? Open field or via in-ear headphones? If open field, then what is the distance between the loudspeaker and the participant’s ear? This is an important detail since the sound pressure level (SPL) decreases quadratically with the distance.
- Which words will be presented?

It is planned that “Afterward, they fill out a questionnaire about their sleep … and a question about spontaneous, non-experimenter awakenings).”
Will this be cross-validated using the EEG data? Which purpose does this question serve? Whether participants can recall non-experimental awakenings, or whether they had any at all? Please clarify.

“After 3 minutes of stable NREM and REM sleep”
Which NREM sleep stage exactly? N1, N2, N3? I guess N3 would be the most obvious choice, however earlier in the report, the authors mentioned N2. Please clarify.

“experimenters will play audio cues for a maximum of 12 minutes at the detected audio threshold using two loudspeakers placed next to the participant’s head.”
What will be the distance between the loudspeaker and the ear? Since the sound pressure level (SPL) decreases quadratically with distance, it is very important that this distance is either always kept constant or recorded for each participant on the night in which the threshold values are determined. This is the only way to ensure that on the second night the distance and thus the sound pressure level is comparable and actually corresponds to the estimated threshold value.
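As a rough illustration of why the loudspeaker distance matters: under an idealized free-field, point-source assumption (which a bedroom with two loudspeakers only approximates), sound intensity falls with the square of the distance, which corresponds to a level drop of about 6 dB for every doubling of distance. The function name and the example values below are hypothetical and are not taken from the protocol.

```python
import math


def spl_at_distance(spl_ref_db, d_ref_m, d_m):
    """Level at distance d_m, given a level measured at d_ref_m.

    Assumes free-field spherical spreading from a point source, so that
    intensity is proportional to 1 / distance**2. Room reflections are ignored.
    """
    if d_ref_m <= 0 or d_m <= 0:
        raise ValueError("distances must be positive")
    return spl_ref_db - 20.0 * math.log10(d_m / d_ref_m)


if __name__ == "__main__":
    # Hypothetical case: threshold found at 40 dB SPL with the ear 0.5 m
    # from the loudspeaker; on a later night the ear is 1.0 m away.
    print(round(spl_at_distance(40.0, 0.5, 1.0), 2))
```

In this idealized case a participant lying twice as far from the loudspeaker would receive the cue roughly 6 dB below the level at which the threshold was determined, a difference larger than the 5 dB step used in the threshold procedure.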

“In both experimental nights, participants will be instructed to signal if they have a period of lucidity. Dream reports with lucidity will be removed from analyses (score >= 4 on the lucidity scale).”
- How will participants signal periods of lucidity? I guess via voluntary eye movements according to a pre-defined code. However, how can it be assured that participants are actually able to do so during lucid periods? Usually, this requires training even for frequent lucid dreamers. Please clarify!
- Please cite literature on lucidity scale.

“The inclusion criteria to participate in the study are … high English language proficiency”.
- How will the proficiency be assessed?
- Wouldn’t it be better to restrict the inclusion criteria to English native speakers? In neurolinguistic studies, it is of great importance whether participants are L1 or L2 learners of a given language. To some extent this also applies to memory consolidation studies.