The authors have addressed all of the comments I raised. My main concern was about exclusion rates, and I can see from the explanation in the response letter and the new information in Table 2 that the majority of the participants I had assumed were excluded actually did not complete the survey. Given that the actual exclusion rates were quite low, I don't think the points I had suggested discussing in the manuscript are necessary, and I am happy with the current edits. I also find the new structure of the results section clearer; it is now easier to understand which hypotheses are being tested. The additional explanation and figure note on lasso-regularized models are helpful too.
Thanks to the authors for their very responsive and thorough revisions throughout the process.
Title: Self-Control Beyond Inhibition. German Translation and Quality Assessment of the Self-Control Strategy Scale (SCSS)
I think the authors have comprehensively responded to the comments from the editor and the reviewers, which has improved the quality of the manuscript, particularly in terms of readability, transparency, and conclusions.
As with the previous stage, I understand that some of my more detailed comments may be better addressed in a separate manuscript.
Overall, I find the project and the resulting manuscript to be very interesting, comprehensive, transparent, and relevant to self-control research.
DOI or URL of the report: https://doi.org/10.31234/osf.io/gpmnv
Version of the report: 2
Thanks to the authors for a well-written and very interesting Stage 2 report!
2A. Whether the data are able to test the authors’ proposed hypotheses (or answer the proposed research question) by passing the approved outcome-neutral criteria, such as absence of floor and ceiling effects or success of positive controls or other quality checks.
To assess this criterion, I would find it helpful for the authors to provide additional information to explain why so much data was excluded. If I am reading Table 3 correctly, a very large proportion of responses were excluded in Studies 4 and 5 (58% and 53%). I visited the link on p. 14 to find the number of exclusions by reason and sample (https://osf.io/aup93/), but could not find this information. Could the authors point me to where I can find it on OSF, or add it if it has not yet been uploaded?
The pre-registered exclusion criteria were clear, and were quite stringent regarding responding incorrectly to attention checks, so it does not necessarily seem surprising to me that a lot of data was excluded. However, that does seem like a very large amount of unused data, and I am curious about what happened here and what the authors’ interpretations were. For example, I would be interested to hear whether the authors feel confident that all of the excluded data are of poor quality, and whether they think there is any chance that the exclusions relate systematically to questionnaire scores (e.g., are people with lower inhibition more likely to miss an attention check?).
Personally I think it would be useful to provide more detail on this in the manuscript as well as in the response letter.
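To illustrate the kind of check I have in mind: a simple test of whether exclusion status relates to a questionnaire score could look roughly like the sketch below. This is only a minimal illustration with hypothetical file and column names ('excluded' as a 0/1 flag, 'bscs_mean' as an example score), not a description of the authors' actual data or pipeline.

# Minimal sketch: does exclusion status relate to a questionnaire score?
# 'excluded' (0/1) and 'bscs_mean' are assumed column names, not the authors' variables.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("all_responses_with_exclusion_flag.csv")  # hypothetical file

model = smf.logit("excluded ~ bscs_mean", data=df).fit()
print(model.summary())  # a clear association would suggest exclusions are not random

If the coefficient were non-trivial, that would suggest the exclusions are not random with respect to the construct, which would be worth acknowledging in the manuscript.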
2B. Whether the introduction, rationale and stated hypotheses (where applicable) are the same as the approved Stage 1 submission.
I did not note any problems with this criterion. However, one thing I’d note is that the hypotheses in the introduction were very clearly stated, but the results section does not refer specifically to the hypotheses, and the subheadings also don’t use wording consistent with the introduction. A clearer correspondence between these two sections would make it easier for readers to follow along as the aims are tested and to understand whether all the hypotheses were supported by the data.
My suggestion would be to use the three aims stated in the Design Table as the subheadings (i.e. internal structure and reliability, convergent and discriminant validity, relationship to self-control outcomes) and underneath these headings to refer specifically to the numbered hypotheses.
Related to this, I was not sure it worked well to start the results section with a section titled ‘Study 1 (Pilot)’. Study 1 is really quite different from the other studies (i.e. methodological development rather than hypothesis testing), and this section doesn’t test any hypotheses (or really present any results); it is more like preliminary work that allows the other hypotheses to be tested. Perhaps a different, more descriptive subheading would make this clearer, or this description of scale development could even be moved to the method (where the SCSS measure is described).
2C. Whether the authors adhered precisely to the registered study procedures.
I did not note any problems with this criterion.
2D. Where applicable, whether any unregistered exploratory analyses are justified, methodologically sound, and informative.
I generally found these informative and clearly explained. However, it seems to be assumed that the reader will already know what a lasso-regularized network model is and how to interpret it. I did not know these things and would have appreciated more information!
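For readers in a similar position, even a sentence or two in the text would help: in such a network, nodes are items or subscales and edges are partial correlations estimated under an L1 (lasso) penalty, so weak edges are shrunk to exactly zero and the edges that remain represent conditional associations after controlling for all other nodes. The sketch below illustrates that logic only; psychological network papers typically use the EBIC graphical lasso in R's qgraph/bootnet packages, whereas this example uses scikit-learn's graphical lasso and a hypothetical input file, so it is not meant to reproduce the authors' pipeline.

# Illustrative sketch of a lasso-regularized (graphical lasso) network:
# edges are penalized partial correlations; weak edges shrink to exactly zero.
import numpy as np
import pandas as pd
from sklearn.covariance import GraphicalLassoCV

df = pd.read_csv("scss_subscale_scores.csv")   # hypothetical file of subscale scores
X = (df - df.mean()) / df.std()                # standardize before estimation

est = GraphicalLassoCV().fit(X)
prec = est.precision_                          # penalized inverse covariance matrix

# Convert the precision matrix into partial correlations (the network's edge weights)
d = np.sqrt(np.diag(prec))
partial_corr = -prec / np.outer(d, d)
np.fill_diagonal(partial_corr, 0)
print(pd.DataFrame(partial_corr, index=df.columns, columns=df.columns).round(2))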
2E. Whether the authors’ conclusions are justified given the evidence.
I thought the conclusions were appropriate. The descriptions of the findings seemed accurate, the authors’ interpretations seemed reasonable, and the authors were careful to make relevant caveats clear, e.g. that the data could often be interpreted in multiple ways and that interventions are not necessarily justified based on these data.
In the discussion section, again I thought using wording consistent with the earlier parts of the paper would be helpful, i.e. starting by using the three key aims as subheadings (internal structure and reliability, convergent and discriminant validity, relationship to self-control outcomes) before moving on to other topics which are more exploratory or speculative.
I hope these comments are helpful and am happy to provide clarification if needed.
Eleanor Miles
Summary
Thank you for the opportunity to review this manuscript in full. I read this new version (including the results) with great interest. As noted below, most of my notes center on the quality of the data – most notably, the substantial proportion of participants lost in the social media samples. Below I make specific recommendations to further investigate this issue. I then highlight a few points regarding the discussion.
Results
In Table 3, it looks like a lot of data was lost to quality checks. In some studies, the retention rate is as low as 42-53%. This amount of data loss makes me very hesitant about the quality of the data overall (e.g., was there something wrong with the sampling methods? Maybe the study was too long, considering the number of variables assessed?). At the very least, this needs to be explicitly addressed in the discussion and any conclusions should be tempered as a result. I might also suggest doing some investigation as to why this issue with data quality exists (e.g., there appear to be a lot of attention checks; are most people failing the same checks within a single scale? Are people failing one check or many?).
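As a concrete example of the diagnostic I have in mind (a sketch only, assuming 0/1 pass flags with made-up column names, not the authors' actual variables):

# How many participants fail each attention check, and how many checks does each participant fail?
import pandas as pd

df = pd.read_csv("raw_responses.csv")                                      # hypothetical raw data file
check_cols = [c for c in df.columns if c.startswith("attention_check_")]   # assumed 0/1 pass flags

failures_per_check = (1 - df[check_cols]).sum().sort_values(ascending=False)
print(failures_per_check)                                # is one check driving most exclusions?

failures_per_person = (1 - df[check_cols]).sum(axis=1)
print(failures_per_person.value_counts().sort_index())   # failing one check vs. many?

If most failures cluster on a single check or within a single scale, that would point to a problem with that item or its placement rather than with the samples as a whole.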
For Table 3, it would also be helpful to have a row indicating the type of sample (e.g., Prolific, student, etc.) used in each study. That way, this information is easily accessible when comparing metrics across samples.
With the above point in mind, I went back and saw that the samples with the lowest data quality seem to be those collected through social media. While I think that makes a little bit more intuitive sense, I think finding some reference regarding social media data quality is imperative here (e.g., is this amount of data loss similar to what other studies experience?).
Finally, considering the issues with data quality, I would also recommend (a) re-running any pooled sample analyses without those samples, and (b) re-running any quality assessments (e.g., those listed in Table 6) for individual samples. This would serve as a sanity check of whether the findings remain the same without these seemingly problematic samples.
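For (b), one straightforward version of such a check would be to recompute a quality metric (e.g., Cronbach's alpha) per sample and for a pooled sample that excludes the social media studies, roughly along these lines (file, item, and sample labels are placeholders, not taken from the authors' materials):

# Re-run a reliability estimate per sample and for a pooled sample without the social media studies.
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    # Standard formula: alpha = k/(k-1) * (1 - sum of item variances / variance of the sum score)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

df = pd.read_csv("pooled_item_data.csv")               # hypothetical pooled data file
bi_items = [f"scss_bi_{i}" for i in range(1, 6)]       # assumed item names for one subscale

for sample, grp in df.groupby("sample_type"):          # e.g., Prolific, student, social media
    print(sample, round(cronbach_alpha(grp[bi_items]), 2))

no_social = df[df["sample_type"] != "social_media"]
print("pooled without social media:", round(cronbach_alpha(no_social[bi_items]), 2))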
Discussion
Regarding the lack of an association between strategies and habits: I’m not as deeply entrenched in the habits literature, but a really interesting paper on how strategies and habits can be synergistic may be worth considering for the discussion (see citation below). Generally speaking, the question of whether strategies predict habits is interesting, though the answer is not obvious to me (and certainly not a requirement for construct validation), and to my knowledge it does not have strong empirical support. However, this recent paper discusses how habits and strategies are often treated as different processes, yet strategies can be used to support more complex habits. Perhaps this newer framing of the interplay between strategies and habits might be useful for the authors’ discussion.
Given that the field is shifting its emphasis away from which strategies are “good” vs. “bad”, I am rather cautious about some of the conclusions in the discussion that use this language. The SCSS is designed to capture general/habitual strategy use, but it is most likely the case that some strategies work well in certain situations and backfire in others. While we can start to capture this at the habitual level by assessing moderators (which is beyond the scope of the current study), I suspect these distinctions are more likely to emerge when capturing actual strategy use (e.g., when actively managing a self-control episode in the moment). Perhaps some discussion around this would help give greater context to the current findings (see one of the original conceptual pieces by Bonanno & Burton (2013), some discussion of this issue focusing specifically on reappraisal by Ford & Troy (2019), or a recently adapted version by Werner & Ford (2023) that bridges some of those concepts to self-control).
Smaller Comments
Title: Self-Control Beyond Inhibition. German Translation and Quality Assessment of the Self-Control Strategies Scale (SCSS).
I have read the Stage 2 Registered Report “Self-Control Beyond Inhibition. German Translation and Quality Assessment of the Self-Control Strategies Scale (SCSS)” with great interest and found it to be a well-structured, thorough, transparent, and insightful manuscript. I applaud the authors for undertaking the effort to go through the registered report process for this ambitious project. As I anticipated at Stage 1, this manuscript offers important insight into self-control, the use of different strategies, and their associations with a variety of outcomes, going well beyond validating a translation of the SCSS (which in itself would have been a valuable contribution).
The changes made from the final Stage 1 manuscript to the current version are perfectly acceptable to me, as they are either additions to sections that could not be filled out before conducting the studies, or small changes to grammar etc., which are, of course, completely fine. I found no noteworthy instances where the authors deviated from what was previously stated at Stage 1 in a way that might somehow distort the findings, and the exploratory analyses seem appropriate to me. Furthermore, the supplemental material on the OSF is very detailed and thorough.
Despite liking the manuscript very much in general, I do have some points I want to discuss, which mainly relate to the interpretation and discussion of the findings.
Major points
1. I do have some concerns about Behavioral Inhibition (BI) and the fact that the discussion now portrays it as a highly effective strategy (e.g., on p. 36 “behavioral inhibition still related to the highest number of outcomes […] This is good news because previous research put a strong emphasis on behavioral inhibition. Our results show that this focus is not unwarranted.”). While this is backed by the results, I interpret them somewhat differently. I have ever-increasing doubts that one can actually call BI a strategy (also considering the items used in the SCSS) and think it is better understood as an outcome (see also Werner, Inzlicht, et al., 2022). For example, a person might respond to the item “I find it easy to keep myself from acting on unwanted desires” with high levels of agreement (indicating high levels of BI) not only because they are better at resisting desires through sheer willpower, but because they might have successfully used any of the strategies before this point that downregulated the level of desire they feel in the moment.

Therefore, to some extent, this “strategy” might appear highly effective because successfully using any of the preceding strategies is assessed through this “strategy” as well. BI would then be conflated not only with the use of previous strategies but with their at least partially successful use (perhaps because they were implemented context-sensitively). Additionally, some people might in general perceive certain desires as less tempting, and such individual differences might play into this as well. The concern that BI might measure an outcome rather than a process is further supported by the finding that BI was strongly related to the BSCS (at .74***, well above any other strategy; the second highest correlation with the BSCS was only .35***). This is noteworthy because similar concerns have been raised for the BSCS, i.e., that it might measure an outcome (successful self-control) rather than the underlying process of how that outcome was achieved (see Bürgler et al., 2022).

I furthermore found it fascinating that BI was significantly positively associated with habit strength across all behaviors and timepoints. Quite often it was the only, or one of few, strategies that showed significant associations with habit strength, and in all cases it showed the strongest association. Habitual behavior is enacted automatically, with minimal cognitive effort, awareness, and control (Gardner & Rebar, 2019). Theory would therefore strongly suggest that BI, if it actually measures the degree to which the “strategy” of effortful behavioral inhibition is used, should be negatively associated with habit strength. If this is not the case, it might be a sign that the BI subscale measures something it is not supposed to measure (e.g., an outcome rather than a process). Subsequently, it might be appropriate to temper some of the wording when discussing the efficacy of this strategy and to discuss some of these potential issues and open questions.
2. Regarding strategy repertoire, the authors wrote “This indicates that it might be useful to have a broad repertoire of strategies” (p. 37). Would it not be possible to provide these analyses with the current data, similar to Werner, Wu et al. (2022) as an additional exploratory analysis (perhaps reported in the SOM)? It appears to me that calculating a repertoire size (e.g., using a sum index, see Werner, Wu et al., 2022) might be a good way to utilize data gathered from the SCSS in general. This might be especially relevant when considering the following point.
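As a sketch of what I mean by a sum index (whether this matches the exact operationalization in Werner, Wu et al., 2022 should be checked against that preprint; the subscale labels and scale midpoint below are placeholders):

# Repertoire size as the number of SCSS subscales a person scores above the scale midpoint.
import pandas as pd

df = pd.read_csv("scss_subscale_scores.csv")               # hypothetical file of subscale means
subscales = ["situation_selection", "distraction", "reappraisal",
             "goal_setting", "behavioral_inhibition"]      # placeholder subscale labels
MIDPOINT = 4                                               # assumed midpoint of a 1-7 response scale

df["repertoire_size"] = (df[subscales] > MIDPOINT).sum(axis=1)
print(df["repertoire_size"].describe())
# repertoire_size could then be related to the self-control outcomes already in the manuscript.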
3. Some of the strategies are described as “maladaptive” (e.g., on p. 37). Here, I find it important to discuss that describing some strategies as maladaptive is somewhat dangerous (see the “fallacy of uniform efficacy” in emotion regulation research; Bonanno & Burton, 2013). Strategy efficacy likely depends on many factors, for example, the context and the person using the strategy (see Hennecke & Bürgler, 2020). Strategies should therefore be used flexibly and context-sensitively, and strategies that are effective in some situations might not work in others (e.g., Wenzel et al., 2023). The SCSS assesses strategy use very broadly across situations (an issue briefly discussed in the limitations on p. 41), which is why such nuances are lost. I think it is valuable to provide insight into which strategies are generally effective, but it might be important to clearly label this as “general efficacy” and to discuss that lower general efficacy does not necessarily mean a strategy is maladaptive in every situation and for every person.
Minor points
4. Regarding the additions made from the previous version, there was one small change for which I would have liked a short explanation (perhaps in a footnote): the note regarding the PHQ-9, “We only assessed eight of the items, not including the measure for suicidal thoughts and tendencies.” (p. 19).
5. When introducing the different strategies on p. 9, it might be confusing to some readers to see “changing environments” as part of situation selection strategies and not situation modification, as these are commonly described as two separate groups of strategies that can be subsumed under “situational strategies” (e.g., Duckworth et al., 2016, p. 40). Similarly, on p. 12 and p. 13, two strategies are described as “situation selection” which I would clearly describe as situation modification, i.e., “getting rid of one's dryer” and “turning off the wifi automatically to go to bed earlier”. Therefore, a small addition in the introduction of the strategies might be needed to explain that you consider both situation modification and situation selection strategies under the term “situation selection”, similar to what Katzir et al. (2021) did by treating both situation modification (or “stimulus control”) and situation selection strategies as one type of strategy, because they “load onto the same factor and were therefore combined into one final subscale (situation selection)” (p. 5).
6. Regarding the thresholds used for R² and adjusted R² (e.g., “R² < .26” on p. 23): while the same thresholds were defined at Stage 1, I have now realized that there appears to be no reference justifying those exact values.
References
Bonanno, G. A., & Burton, C. L. (2013). Regulatory flexibility: An individual differences perspective on coping and emotion regulation. Perspectives on Psychological Science, 8, 591–612. https://doi.org/10.1177/1745691613504116
Bürgler, S., Kleinke, K., & Hennecke, M. (2022). The Metacognition in Self-Control Scale (MISCS). Personality and Individual Differences, 199, Article 111841. https://doi.org/10.1016/j.paid.2022.111841
Duckworth, A. L., Gendler, T. S., & Gross, J. J. (2016). Situational Strategies for Self-Control. Perspectives on Psychological Science, 11(1), 35-55. https://doi.org/10.1177/1745691615623247
Gardner, B., & Rebar, A. L. (2019). Habit formation and behavior change. In Oxford research encyclopedia of psychology. https://doi.org/10.1093/acrefore/9780190236557.013.129
Katzir, M., Baldwin, M., Werner, K. M., & Hofmann, W. (2021). Moving beyond Inhibition: Capturing a Broader Scope of the Self-Control Construct with the Self-Control Strategy Scale (SCSS). Journal of Personality Assessment, 103(6), 762–776. https://doi.org/10.1080/00223891.2021.1883627
Wenzel, M., Bürgler, S., Brandstätter, V., Kreibich, A., & Hennecke, M. (2024). Self-Regulatory Strategy Use, Efficacy, and Strategy-Situation-Fit in Self-Control Conflicts of Initiation, Persistence, and Inhibition. European Journal of Personality, 38(2), 189-208. https://doi.org/10.1177/08902070221150478
Werner, K. M., Wu, R., Gross, J., & Friese, M. (2022). When Bigger is Better: Size of Strategy Repertoire Predicts Goal Attainment [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/5uvxg
Werner, K. M., Inzlicht, M., & Ford, B. Q. (2022). Whither inhibition? Current Directions in Psychological Science, 31(4), 333-339. https://doi.org/10.1177/09637214221095848