Putting climate action intervention to the test: Part 1

Based on reviews by Helen Landmann, Jana Kesenheimer and 1 anonymous reviewer
A recommendation of:

A climate action intervention to boost individual and collective climate mitigation behaviors in young adults


Submission: posted 11 January 2024
Recommendation: posted 29 April 2024, validated 30 April 2024
Cite this recommendation as:
Chambers, C. (2024) Putting climate action intervention to the test: Part 1. Peer Community in Registered Reports.


It is increasingly recognised that resolving the climate crisis will require not only the reform of law and government policy but collective grassroots action to change individual behaviour and put public pressure on political leaders, companies and institutions to cut emissions. The capacity, however, for individual citizens to take such steps is limited by lack of knowledge/awareness of means and opportunities as well as psychological barriers that can make such actions seem impossible, fruitless or against the person's immediate self-interest. Interventions designed to overcome these obstacles and promote individual behaviour change have met with only limited success, with many based on weak psychological evidence and the outcome measures used to evaluate their success prone to error and bias.
In the current submission, Castiglione et al. (2024) propose a series of five studies to test, evaluate, and optimise a longitudinal intervention for engaging young adults (aged 18-35) in individual and collective climate action. Building on existing theory and evidence, the authors have designed an intensive 6-week educational intervention that draws on 12 psychological factors linked to pro-environmental behaviour, including emotional engagement, self-efficacy, collective efficacy, theory of change, cognitive alternatives, perceived behavioral control, implementation intentions, social norms, self-identity, collective identity, appraisal, and faith in institutions. Through the use of ecological momentary assessment (EMA), they plan to measure these targeted psychological correlates as well as individual and collective climate engagement of participants before and after the intervention (and in active groups vs. controls), and then again after a further three months.
The current submission is novel in being the first at PCI RR (and possibly the first RR anywhere) to propose an incremental programmatic workflow that combines two innovations: a single Stage 1 protocol leading to multiple Stage 2 outputs (under the PCI RR programmatic track) and a prespecification in which the design of the intervention in later studies is (for now) determined only broadly, with specific parameters to be shaped by the results of the first set of studies (under the PCI RR incremental registrations policy). This particular Stage 1 manuscript specifies the design of study 1 in two samples (high-school and university students in Italy; producing one Stage 2 output for each sample) and the general design of subsequent studies. The details of this later research in study 2 (in the same two populations) and study 3 (university students in the Netherlands) will be developed sequentially based on the results of the previous Stage 2 outputs and the state of the literature at that time.
The Stage 1 manuscript was evaluated over two rounds of in-depth review. Based on detailed responses to the reviewers' comments, the recommender judged that the manuscript met the Stage 1 criteria and therefore awarded in-principle acceptance (IPA). Following the completion of study 1, the authors will submit an updated Stage 1 manuscript for re-evaluation that updates the plans for later studies accordingly, hence the current recommendation is labelled "Part 1".
URL to the preregistered Stage 1 protocol: (under temporary private embargo)
Level of bias control achieved: Level 6. No part of the data or evidence that will be used to answer the research question yet exists and no part will be generated until after IPA.
List of eligible PCI RR-friendly journals:
Castiglione, A., Brick, C., Esposito, G., & Bizzego, A. (2024). A climate action intervention to boost individual and collective climate mitigation behaviors in young adults. In principle acceptance of Version 3 by Peer Community in Registered Reports.
Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.

Evaluation round #2

DOI or URL of the report:

Version of the report: 1 (Revised Stage1_ProgrammaticPCI-RR_A climate action intervention to boost individual and collective climate mitigation behaviors in young adults.dox)

Author's Reply, 18 Apr 2024

Decision by Chris Chambers, posted 08 Apr 2024, validated 09 Apr 2024

Two of the three reviewers from the previous round were available to evaluate your revised submission. The third reviewer was available in principle (and in the event of IPA may return at a later point) but could not provide a re-review within a short time-frame, so given the time constraints you are facing I decided to press ahead with a decision based on these reviews.

As you will see, Helen Landmann makes (I think) an excellent point about the recruitment methodology -- one that I know you are also aware of, but please consider it carefully again. I will make a final Stage 1 decision following your response to this point (without going back to the reviewers).

Reviewed by Helen Landmann, 31 Mar 2024

Thank you very much for this excellent revision, which addressed all the points I raised. I have just one major concern left: in the revised version, it becomes clear that the recruitment strategy differs between the experimental group and the control group. “The experimental group will be recruited to participate in six hour meetings on climate education and fill online questionnaires, while the control group only to complete the questionnaires due to the difficulty of recruiting all 120 participants for the in-person meetings.” This would make it very difficult to infer causal effects from the study: every difference between the two conditions could be attributed to the different recruitment strategies and the different motivation that participants had from the start. The authors describe that they would use initial motivation as a control variable in their analyses, but this only partly addresses the problem.

The study requires a lot of time and effort, and it would be a pity if causal inferences were severely restricted by the different sampling strategies. I would therefore encourage the authors to consider a waitlist control group that receives the treatment later than the experimental group; with this design, all participants would be recruited with the same information. I am aware that it is more difficult to recruit participants for the in-person meetings than for online questionnaires alone. However, you have planned five different studies with the design described above. I would suggest reducing the number of studies but choosing the more robust design.

Reviewed by Jana Kesenheimer, 19 Mar 2024

The authors have addressed my comments (and also those of the other reviewers) thoroughly and appropriately. The idea of the coding list in particular strikes me as a valuable addition! I appreciate the numerous additions to this work and endorse publication of the registered report. I am eager to see the results and wish the researchers success in recruiting participants and carrying out their project!

Evaluation round #1

DOI or URL of the report:

Version of the report: 1

Author's Reply, 15 Mar 2024

Decision by Chris Chambers, posted 06 Mar 2024, validated 06 Mar 2024

I have now obtained three very helpful and constructive reviews of your programmatic Stage 1 submission. Based on my own reading and the enclosed specialist evaluations, I believe the proposal is a promising candidate for eventual in-principle acceptance, but there are a number of major design issues to iron out. Without providing an exhaustive list, major issues have arisen concerning the specificity of the hypotheses, sample size, dropout rates (and the response to dropouts), suitability and efficiency of the analysis plan, and potential interpretation (and interpretability) of different outcomes. While these will require careful consideration in revision, I don't see any one of these issues as a roadblock -- given the incremental and somewhat complex nature of the design, they strike me as somewhat inevitable yet just as important to resolve satisfactorily.

I look forward to receiving your revised manuscript in due course.

Reviewed by Helen Landmann, 11 Feb 2024

The registered report "A climate action intervention to boost individual and collective climate mitigation behaviors in young adults" proposes a set of five intervention studies with control groups and three measurement time points. I highly appreciate the attempt to investigate the causal effects of pro-environmental interventions in longitudinal studies. The design allows for pre-post comparisons, comparisons with a control group, and the investigation of effects three months after the intervention. It is also fortunate that the studies will take place in different countries (Italy and the Netherlands). The study plan requires much effort and time; such intervention studies are highly relevant in practice but so far rare.

In the summary of the psychological obstacles to pro-environmental behavior, I missed the point that most pro-environmental behaviors are embedded in the structure of a social dilemma: the behavior that provides the largest short-term benefits for the individual differs from the behavior that benefits the collective in the long run (see Claessens et al., 2022; Steg et al., 2014).

As correlates of individual and collective pro-environmental behavior, you mention negative emotions, individual and collective efficacy, environmental identity, social norms, and cognitive alternatives. You may want to consider the role of positive emotions and activist identity in addition (see Landmann & Naumann, 2023).

The distribution of studies between Italy and the Netherlands is a bit one-sided: only one study is planned in the Netherlands, and the other four in Italy. I could not find an argument for why this is not more balanced.

I would prefer more information about the expected dropout rate. Do you expect dropout only in the experimental group, and not in the control group? I think you should allow for some dropout in the control condition as well.

Please clarify whether participants are randomly assigned to the experimental or the control condition or whether they decide themselves if they want to participate in the intervention. You write that participants will receive a certificate for participating in the intervention. What about those in the control group? Do they have the same incentive for participation (including the certificate) or is the appeal for participation for those in the control group different?

I disagree that the comparison with the control group completely cancels out the possible effect of social desirability: the perceived expectation of reporting pro-environmental behavior might be higher in the experimental condition, in which participants are repeatedly confronted with environmental issues. Nevertheless, assessing pro-environmental behavior by self-report is still suitable for the planned studies; if people are asked very specific questions about their behavior, as in the EMA, it is difficult to lie. Thus, I think the self-report measures of pro-environmental behavior are fine; I just suggest discussing their limitations differently.

The mixed effects linear model seems suitable for testing H1. I wondered why you chose a different analysis for testing H2 (repeated measures ANOVA); a mixed effects model would also be suitable here.

For testing whether the effects of the intervention differ between the studies (H4), you need to add condition as a predictor as well as its interactions with study ID. This would make your planned regression analyses very complex. It may therefore be more convenient to test H4 with mixed effects models including condition and study ID as predictors.

I’m already looking forward to seeing the results!

Claessens, S., Kelly, D., Sibley, C. G., Chaudhuri, A., & Atkinson, Q. D. (2022). Cooperative phenotype predicts climate change belief and pro-environmental behaviour. Scientific Reports, 12(1), 12730.

Landmann, H., & Naumann, J. (2023). Being positively moved by climate protest predicts peaceful collective action. Global Environmental Psychology.

Steg, L., Bolderdijk, J. W., Keizer, K., & Perlaviciute, G. (2014). An integrated framework for encouraging pro-environmental behaviour: The role of values, situational factors and goals. Journal of Environmental Psychology, 38, 104-115.

Reviewed by Jana Kesenheimer, 23 Jan 2024

Dear Anna and greetings to the entire research team,

I had the pleasure of reviewing your registered report in which you present a comprehensive long-term study involving approximately 50 participants undergoing an intervention, with effects compared to a control group of around 50 individuals. I believe the described approach is a valuable initiative to confront interested (yet not actively engaged) students and other young individuals with climate change, providing them with motivation for action. The theory-guided intervention based on the 12 factors is particularly promising. Additionally, the ideas for analyses considering interaction effects over time involving group, time, and person (as a random effect) levels are sensible.

While reading, some concerns and ideas arose, which I will describe below:

·         If I understand correctly, the control group will keep diaries (EMA) for a total of 12 weeks (3 months). A 30% drop-out rate seems optimistic, especially given that there is roughly a 3-in-100 chance of winning €150 at the end. It might be worthwhile to look up dropout rates from other environmentally focused diary studies and adapt this rate if necessary.

·         EMA has a significant drawback not yet mentioned: participants can report anything, which, while acknowledged as an advantage, may lead to large subjective distortions when only frequency is queried. For example, we conducted a study in which environmentally less conscious individuals reported partly mundane activities (e.g., using a lid while cooking to save energy), while others reported more elaborate behaviors (e.g., refraining from a flight and opting for a several-hour train journey). If both are weighted equally in terms of frequency, personal biases (personality, attitudes, etc.) are not reflected, which could obscure the results. Honestly, I'm not sure how you could address this issue, but I wanted to point out that such distortion can occur in EMA.

·         The effort required from participants seems substantial. Likely, only individuals already living environmentally friendly lives will persist, raising questions about how much variance is left for improving their behaviors, as you yourselves note in the report. Considering the anticipated ceiling effects and drop-out rates, I personally find the sample size a bit small. Additionally, the power analysis mentions 14 replicates, while a minimum of 4 replicates is discussed on page 15. Should the number of replicates in the power analysis be reduced accordingly?

·         It would be great to learn more about the sample of the study you described on pages 2-3 (e.g., were the participants also young, and how many were there?).

·         Participants might be aware of what you are investigating, leading to biased responses in line with the expectation that the sessions should have an effect. How do you plan to account for this bias?

·         The focus on negative affect regarding climate change (anxiety) is noted. It would be beneficial if you could elaborate on why positive emotions like hope are excluded. Another crucial negative emotion seems to be anger, which doesn't paralyze but activates.

·         One last thought: considering your assumption that planning on one day influences implementation on another, incorporating AR(1) models might be beneficial.
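To make the AR(1) suggestion concrete, here is a minimal sketch in Python (all names and values are illustrative; in the actual analyses the autoregressive term would sit inside the planned mixed-effects framework rather than being estimated this way):

```python
import random

def simulate_ar1(phi, n, seed=1):
    """Simulate an AR(1) series: y[t] = phi * y[t-1] + noise."""
    rng = random.Random(seed)
    y = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0, 1))
    return y

def estimate_phi(y):
    """Least-squares estimate of phi from the lag-1 regression (no intercept)."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den

# A hypothetical daily "planning" series with lag-1 dependence
daily_planning = simulate_ar1(phi=0.5, n=2000)
phi_hat = estimate_phi(daily_planning)  # recovers a value close to 0.5
```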

Best regards and good luck with your research! I look forward to following the progress of this project.


Reviewed by anonymous reviewer 1, 06 Mar 2024

Thank you for the opportunity to review this programmatic registered report. The goals of this work are timely and important.

Major feedback:

The overarching hypotheses are sensible but not specific. I understand that this may be challenging given the iterative nature of the design, but given the large number of outcomes and psychological factors, it would be helpful to know more about: 1) how hypotheses will be refined across studies (e.g., if some outcomes but not others are altered in study 1, will this prior information be incorporated in some way in study 2?); and 2) how the false positive rate will be minimized across outcomes and moderators.

This program of research has an implicit causal model (the intervention should increase the targeted psychological mechanisms, which in turn should change behavior), so it is unclear why methods for testing causal pathways are not actually proposed. In its present form, H1 maps onto the c path (X --> Y), H2 maps onto the a path (X --> M), and H3 contains a combination of the b path (M --> Y) and the indirect path (X --> M --> Y). Rather than testing these hypotheses in separate models, it would be more parsimonious and informative (i.e., to actually test the implied indirect paths) to test them in a single path model. There are various methods for doing this, including extensions suitable for the EMA data through Bayesian multilevel modeling (e.g., in the brms package). If a Bayesian framework were employed, this would also have the advantage of enabling the priors to be updated from one study to the next based on the results.
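To illustrate the product-of-coefficients logic in the simplest possible form, here is a single-level sketch in Python on simulated data (illustrative only; it deliberately ignores the multilevel EMA structure that a brms-style model would handle, and all parameter values are made up):

```python
import random

def fit_mediation(x, m, y):
    """Single-level product-of-coefficients mediation on centered data.

    a-path: simple regression of M on X; b-path: regression of Y on M
    adjusting for X, solved via the 2x2 normal equations.
    Returns (a, b, indirect), where indirect = a * b.
    """
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    cx = [v - mx for v in x]
    cm = [v - mm for v in m]
    cy = [v - my for v in y]
    sxx = sum(v * v for v in cx)
    smm = sum(v * v for v in cm)
    sxm = sum(p * q for p, q in zip(cx, cm))
    sxy = sum(p * q for p, q in zip(cx, cy))
    smy = sum(p * q for p, q in zip(cm, cy))
    a = sxm / sxx                                         # X --> M
    b = (smy * sxx - sxy * sxm) / (smm * sxx - sxm ** 2)  # M --> Y | X
    return a, b, a * b

# Simulated data consistent with X --> M --> Y (true a = 0.6, b = 0.5)
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(5000)]
m = [0.6 * xi + rng.gauss(0, 1) for xi in x]
y = [0.5 * mi + 0.2 * xi + rng.gauss(0, 1) for xi, mi in zip(x, m)]
a, b, indirect = fit_mediation(x, m, y)  # indirect lands near 0.6 * 0.5 = 0.3
```

The point of estimating a, b, and a*b jointly from one dataset, rather than from separate models per hypothesis, is that the indirect path itself becomes a tested quantity.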

Regarding sample size and power, I am concerned that the estimated effect sizes for H2 and H3 are not realistic and will result in these tests being underpowered. Recent studies have shown that average published effects tend to be much larger than replication or preregistered effects. In one such study, for example, preregistered between-person experiments had a median effect size of just r = .12 versus r = .34 for non-preregistered studies. If power is not increased, this could also negatively affect the iterative nature of the research program: observing a false negative might prompt changes to the intervention that are unnecessary or detrimental, thus affecting subsequent studies. One approach to overcoming this issue would be to power study 1 to detect smaller effect sizes (similar to H1) to get a realistic sense of the effect size for this intervention, and then power subsequent studies accordingly.
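As a back-of-the-envelope illustration of how much this matters for sample size, here is a sketch in Python using the standard normal approximation for a two-sample comparison (alpha = .05 two-tailed, 80% power; the two r values are those quoted above, everything else is illustrative):

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(r, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sample comparison at effect size r.

    Normal approximation: n = 2 * ((z_{alpha/2} + z_{beta}) / d)^2,
    with r converted to Cohen's d via d = 2r / sqrt(1 - r^2).
    """
    d = 2 * r / sqrt(1 - r ** 2)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for alpha = .05
    z_beta = NormalDist().inv_cdf(power)           # 0.84 for 80% power
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)

# Median published effect (r = .34) vs median preregistered effect (r = .12)
print(n_per_group(0.34))  # -> 31 per group
print(n_per_group(0.12))  # -> 269 per group
```

Under this approximation, powering for the published rather than the preregistered effect size would understate the required sample by almost an order of magnitude.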

Regarding attrition, the authors state that they expect ~30% attrition, but as far as I can tell the target sample sizes do not include additional participants to make up for this. Given the very high level of engagement required for this study (and the relatively low compensation), it seems likely that attrition will be high. It may also be wise to preemptively plan for even greater attrition in study 1 and then adjust as needed in subsequent studies.
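The corresponding inflation is simple to compute; a sketch in Python (the numbers are illustrative, not those of the protocol):

```python
from math import ceil

def recruitment_target(n_required, attrition_rate):
    """Number to recruit so that n_required participants remain after attrition."""
    return ceil(n_required / (1 - attrition_rate))

# Illustrative: 50 completers per arm at the stated ~30% expected attrition
print(recruitment_target(50, 0.30))  # -> 72
```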

Another concern is the combination of iterative improvement with assessing generalizability across different developmental and cultural samples. Each of these goals is useful, but in combination they may make the results uninterpretable. That is, the authors expect each iteration to improve the efficacy of the intervention (H4), but that may not be the case for populations more distal from the ones it was developed on. Thus, the final iteration of the intervention may not be the best simply because it works less well for high school students than for university students. Similarly, making changes in study 4 based on how well the intervention performs in young professionals may result in a worse intervention for high school students, who are in a different phase of life and have different affordances. An alternative approach would frame studies 3-5 as tests of generalizability and not use iterative improvement.

Relatedly, the current design does not include many individual difference measures or any planned moderation analyses. Although the hypotheses center on average effects between groups, there will likely be considerable heterogeneity in how effective the intervention is for different individuals. Including a broader array of individual difference measures may help contextualize for whom the intervention is effective (and explain attrition) and generate moderation hypotheses that could be tested in subsequent studies.

Other suggestions that may enhance this work:

  •  Include measures of hope and resilience/well-being in addition to climate anxiety

Other questions that arose while reviewing:

  • The modules have solid coverage of topics, but it is noteworthy that there is no focus on developing strategies for resilience and hope amidst the climate crisis. This seems to be an essential ingredient that buffers people and enables them to continue engaging in climate action, particularly as they learn more and may feel anxiety, grief, hopelessness, etc.
  • How many people will be in each intervention module group? If there are multiple groups, how will intervention fidelity of the module be tested (e.g. between different leaders)?
  • What happens if the intervention does not affect the psychological targets or change the behaviors, and it needs to be drastically overhauled? The current model suggests small incremental improvements but in the absence of pilot data testing the intervention modules (e.g. with manipulation checks to ensure they’re targeting the processes intended), it seems possible that deeper changes may need to be made. 
  • The current set of collective actions focuses primarily on traditional political targets, but what about collective actions in an individual’s more local sphere (e.g., organizing to change the food served at the school cafeteria to reduce climate impact, working with neighbors to plant trees)? These may be more accessible and potentially increase self-efficacy since the effects of their actions are more proximal.
  • What happens if individuals do not plan any actions? Will their percentage scores be treated as NAs or 0s? And what happens if individuals engage in unplanned actions? Will they be able to report those, and will they count toward their action scores?
  • What is the justification for the exclusion thresholds listed? If possible, using methods that include all available data but adjust for differences between estimates from individuals with more versus less data (e.g., shrinkage in multilevel modeling) may reduce bias in the estimates.
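A minimal sketch of the shrinkage idea in Python (an illustrative closed-form partial-pooling estimate; a real multilevel model would estimate the pooling weight from the data rather than fixing it, and `prior_strength` here is an arbitrary placeholder):

```python
def shrunken_mean(values, grand_mean, prior_strength=5.0):
    """Partial-pooling estimate of an individual's mean.

    The estimate is pulled toward the grand mean, more strongly the fewer
    observations the individual contributes. prior_strength is illustrative;
    a multilevel model would estimate the pooling weight from the data.
    """
    n = len(values)
    if n == 0:
        return grand_mean
    w = n / (n + prior_strength)  # weight on the individual's own data
    return w * (sum(values) / n) + (1 - w) * grand_mean

# A participant with 2 diary entries is shrunk strongly toward the grand
# mean (weight 2/7); one with 20 entries mostly keeps their own mean (0.8).
sparse = shrunken_mean([4.0, 6.0], grand_mean=2.0)
dense = shrunken_mean([5.0] * 20, grand_mean=2.0)
```

This keeps every participant in the analysis while preventing those with very few diary entries from contributing noisy, extreme estimates.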
