Recommendation

Synergistic Mindset Intervention for Competitive Situations

Veli-Matti Karhulahti based on reviews by Lee Moore, Ivan Ropovik , Ivana Piterová and Jacob Keech

A recommendation of:

STAGE 1

Optimizing Esports Performance Using a Synergistic Mindsets Intervention

Maciej Behnke, Daniël Lakens, Kate Petrova, Patrycja Chwiłkowska, Lukasz D. Kaczmarek, Jeremy P. Jamieson, James J. Gross https://osf.io/sp6av version 3

Read report on server

Abstract

EN

AR

ES

FR

HI

JA

PT

RU

ZH-CN

Optimizing Esports Performance Using a Synergistic Mindsets Intervention

Affective responses during stressful high-stakes situations can play an important role in shaping performance outcomes. For example, feeling shaky and nervous at a job interview can undermine performance, whereas feeling pumped and excited during a sporting competition can optimize performance. These observations suggest that affect regulation – the way people influence their affective responses – might play a key role in determining high-stakes performance outcomes. To test this hypothesis, we propose to adapt a newly developed synergistic mindsets intervention (Yeager et al., 2022) to high-stakes situations. This adaptation is motivated by the idea that (1) mindsets both about situations and one’s response to situations can be shaped to maximize challenge versus threat responding, and (2) challenge versus threat affective responses will be associated with enhanced performance outcomes. Our particular focus is esports, a context that permits the measurement of affective response – affective experience and real-time cardiovascular responses - and well-defined performance outcomes. After a baseline performance task, we will randomly assign gamers (N = 250) either to a synergistic mindsets intervention or to a control condition in which they will learn brain facts. After two weeks of daily gaming, players will compete in a cash-prize tournament. We will measure affective experiences before the matches and cardiovascular responses before and throughout the matches. Compared to the control condition, we hypothesize that synergistic mindset gamers will show greater challenge affective responses and superior performance outcomes. If these predictions are supported, we will seek to extend this work to other contexts.

Affect, Esports, Reappraisal, Challenge, Threat, Mindset,

This is an automatically generated version. The authors and PCI decline all responsibility concerning its content

تحسين أداء الرياضات الإلكترونية باستخدام تدخل العقليات التآزرية

يمكن أن تلعب الاستجابات العاطفية أثناء المواقف العصيبة ذات المخاطر العالية دورًا مهمًا في تشكيل نتائج الأداء. على سبيل المثال، الشعور بالاهتزاز والتوتر في مقابلة عمل يمكن أن يقوض الأداء، في حين أن الشعور بالإثارة والإثارة أثناء المنافسة الرياضية يمكن أن يؤدي إلى تحسين الأداء. تشير هذه الملاحظات إلى أن التنظيم المؤثر - الطريقة التي يؤثر بها الناس على استجاباتهم العاطفية - قد يلعب دوراً رئيسياً في تحديد نتائج الأداء عالية المخاطر. لاختبار هذه الفرضية، نقترح تكييف تدخل العقليات التآزرية المطور حديثًا (Yeager et al., 2022) مع المواقف عالية المخاطر. الدافع وراء هذا التكيف هو فكرة أن (1) العقليات المتعلقة بالمواقف واستجابة الفرد للمواقف يمكن تشكيلها لتحقيق أقصى قدر من التحدي مقابل الاستجابة للتهديد، و(2) الاستجابات العاطفية للتحدي مقابل التهديد سترتبط بنتائج الأداء المحسنة. ينصب تركيزنا بشكل خاص على الرياضات الإلكترونية، وهو السياق الذي يسمح بقياس الاستجابة العاطفية - التجربة العاطفية واستجابات القلب والأوعية الدموية في الوقت الحقيقي - ونتائج الأداء المحددة جيدًا. بعد مهمة الأداء الأساسية، سنقوم بتعيين اللاعبين بشكل عشوائي (العدد = 250) إما لتدخل العقليات التآزرية أو لحالة تحكم حيث سيتعلمون حقائق الدماغ. بعد أسبوعين من اللعب اليومي، سيتنافس اللاعبون في بطولة ذات جوائز نقدية. سنقوم بقياس التجارب العاطفية قبل المباريات واستجابات القلب والأوعية الدموية قبل المباريات وخلالها. بالمقارنة مع حالة التحكم، فإننا نفترض أن اللاعبين ذوي العقلية التآزرية سيظهرون استجابات عاطفية أكبر للتحدي ونتائج أداء متفوقة. إذا تم دعم هذه التوقعات، فسوف نسعى إلى توسيع نطاق هذا العمل ليشمل سياقات أخرى.

التأثير، الرياضات الإلكترونية، إعادة التقييم، التحدي، التهديد، العقلية،

This is an automatically generated version. The authors and PCI decline all responsibility concerning its content

Optimización del rendimiento de los deportes electrónicos mediante una intervención de mentalidad sinérgica

Las respuestas afectivas durante situaciones estresantes de alto riesgo pueden desempeñar un papel importante en la configuración de los resultados del desempeño. Por ejemplo, sentirse tembloroso y nervioso en una entrevista de trabajo puede socavar el rendimiento, mientras que sentirse animado y entusiasmado durante una competición deportiva puede optimizar el rendimiento. Estas observaciones sugieren que la regulación afectiva (la forma en que las personas influyen en sus respuestas afectivas) podría desempeñar un papel clave en la determinación de resultados de desempeño de alto riesgo. Para probar esta hipótesis, proponemos adaptar una intervención de mentalidad sinérgica recientemente desarrollada (Yeager et al., 2022) a situaciones de alto riesgo. Esta adaptación está motivada por la idea de que (1) las mentalidades tanto sobre las situaciones como sobre la propia respuesta a las situaciones pueden moldearse para maximizar la respuesta al desafío frente a la amenaza, y (2) las respuestas afectivas al desafío frente a la amenaza se asociarán con mejores resultados de desempeño. Nuestro enfoque particular son los deportes electrónicos, un contexto que permite medir la respuesta afectiva (experiencia afectiva y respuestas cardiovasculares en tiempo real) y resultados de rendimiento bien definidos. Después de una tarea de rendimiento inicial, asignaremos aleatoriamente a los jugadores (N = 250) a una intervención de mentalidad sinérgica o a una condición de control en la que aprenderán datos cerebrales. Después de dos semanas de juego diario, los jugadores competirán en un torneo con premios en efectivo. Mediremos experiencias afectivas antes de los partidos y respuestas cardiovasculares antes y durante los partidos. En comparación con la condición de control, planteamos la hipótesis de que los jugadores con mentalidad sinérgica mostrarán mayores respuestas afectivas de desafío y resultados de rendimiento superiores. Si estas predicciones se confirman, buscaremos extender este trabajo a otros contextos.

Afecto, Deportes electrónicos, Reevaluación, Desafío, Amenaza, Mentalidad,

This is an automatically generated version. The authors and PCI decline all responsibility concerning its content

Optimiser les performances de l'e-sport à l'aide d'une intervention synergique

Les réponses affectives lors de situations stressantes à enjeux élevés peuvent jouer un rôle important dans l'évolution des résultats en matière de performance. Par exemple, se sentir tremblant et nerveux lors d'un entretien d'embauche peut nuire aux performances, tandis que se sentir motivé et excité lors d'une compétition sportive peut optimiser les performances. Ces observations suggèrent que la régulation affective – la façon dont les gens influencent leurs réponses affectives – pourrait jouer un rôle clé dans la détermination des résultats de performance à enjeux élevés. Pour tester cette hypothèse, nous proposons d'adapter une nouvelle intervention sur les mentalités synergiques (Yeager et al., 2022) à des situations à enjeux élevés. Cette adaptation est motivée par l’idée que (1) les mentalités à la fois sur les situations et sur la réponse de chacun aux situations peuvent être façonnées pour maximiser la réponse au défi par rapport à la menace, et (2) les réponses affectives au défi par rapport à la menace seront associées à de meilleurs résultats de performance. Nous nous concentrons particulièrement sur l'esport, un contexte qui permet de mesurer la réponse affective (expérience affective et réponses cardiovasculaires en temps réel) et des résultats de performance bien définis. Après une tâche de performance de base, nous assignerons au hasard les joueurs (N = 250) soit à une intervention synergique sur les états d'esprit, soit à une condition de contrôle dans laquelle ils apprendront des faits sur le cerveau. Après deux semaines de jeu quotidien, les joueurs participeront à un tournoi avec des prix en espèces. Nous mesurerons les expériences affectives avant les matchs et les réponses cardiovasculaires avant et tout au long des matchs. Par rapport à la condition de contrôle, nous émettons l'hypothèse que les joueurs à l'état d'esprit synergique montreront de plus grandes réponses affectives de défi et des résultats de performance supérieurs. Si ces prédictions se confirment, nous chercherons à étendre ce travail à d'autres contextes.

Affect, Esports, Réévaluation, Défi, Menace, État d'esprit,

This is an automatically generated version. The authors and PCI decline all responsibility concerning its content

सिनर्जिस्टिक माइंडसेट हस्तक्षेप का उपयोग करके ईस्पोर्ट्स प्रदर्शन को अनुकूलित करना

तनावपूर्ण उच्च जोखिम वाली स्थितियों के दौरान प्रभावशाली प्रतिक्रियाएँ प्रदर्शन परिणामों को आकार देने में महत्वपूर्ण भूमिका निभा सकती हैं। उदाहरण के लिए, नौकरी के लिए इंटरव्यू के दौरान झिझक और घबराहट महसूस करना प्रदर्शन को कमजोर कर सकता है, जबकि खेल प्रतियोगिता के दौरान उत्साहित और उत्साहित महसूस करना प्रदर्शन को अनुकूलित कर सकता है। इन टिप्पणियों से पता चलता है कि विनियमन को प्रभावित करना - जिस तरह से लोग अपनी भावनात्मक प्रतिक्रियाओं को प्रभावित करते हैं - उच्च-स्तरीय प्रदर्शन परिणामों को निर्धारित करने में महत्वपूर्ण भूमिका निभा सकते हैं। इस परिकल्पना का परीक्षण करने के लिए, हम एक नव विकसित सहक्रियात्मक मानसिकता हस्तक्षेप (येजर एट अल., 2022) को उच्च जोखिम वाली स्थितियों के अनुकूल बनाने का प्रस्ताव करते हैं। यह अनुकूलन इस विचार से प्रेरित है कि (1) स्थितियों और स्थितियों के प्रति किसी की प्रतिक्रिया दोनों के बारे में मानसिकता को चुनौती बनाम खतरे की प्रतिक्रिया को अधिकतम करने के लिए आकार दिया जा सकता है, और (2) चुनौती बनाम खतरे की प्रतिक्रियात्मक प्रतिक्रियाएं बेहतर प्रदर्शन परिणामों के साथ जुड़ी होंगी। हमारा विशेष ध्यान ईस्पोर्ट्स पर है, एक ऐसा संदर्भ जो भावात्मक प्रतिक्रिया - भावात्मक अनुभव और वास्तविक समय हृदय संबंधी प्रतिक्रियाएं - और अच्छी तरह से परिभाषित प्रदर्शन परिणामों को मापने की अनुमति देता है। बेसलाइन प्रदर्शन कार्य के बाद, हम गेमर्स (एन = 250) को या तो एक सहक्रियात्मक मानसिकता हस्तक्षेप या एक नियंत्रण स्थिति में यादृच्छिक रूप से असाइन करेंगे जिसमें वे मस्तिष्क तथ्यों को सीखेंगे। दो सप्ताह के दैनिक गेमिंग के बाद, खिलाड़ी नकद-पुरस्कार टूर्नामेंट में प्रतिस्पर्धा करेंगे। हम मैचों से पहले के भावात्मक अनुभवों और मैचों से पहले और उसके दौरान हृदय संबंधी प्रतिक्रियाओं को मापेंगे। नियंत्रण स्थिति की तुलना में, हम परिकल्पना करते हैं कि सहक्रियात्मक मानसिकता वाले गेमर्स अधिक चुनौती वाली प्रभावशाली प्रतिक्रियाएँ और बेहतर प्रदर्शन परिणाम दिखाएंगे। यदि ये भविष्यवाणियाँ समर्थित हैं, तो हम इस कार्य को अन्य संदर्भों तक विस्तारित करने का प्रयास करेंगे।

प्रभाव, निर्यात, पुनर्मूल्यांकन, चुनौती, खतरा, मानसिकता,

This is an automatically generated version. The authors and PCI decline all responsibility concerning its content

相乗的な考え方の介入を使用して e スポーツのパフォーマンスを最適化する

ストレスの多い一か八かの状況における感情的な反応は、パフォーマンスの結果を形作る上で重要な役割を果たす可能性があります。たとえば、就職面接で動揺して緊張するとパフォーマンスが低下する可能性がありますが、スポーツ競技中に興奮して興奮するとパフォーマンスが最適化されます。これらの観察は、感情規制、つまり人々が感情的な反応に影響を与える方法が、一か八かのパフォーマンスの結果を決定する上で重要な役割を果たしている可能性があることを示唆しています。この仮説を検証するために、新しく開発された相乗的マインドセット介入 (Yeager et al., 2022) を一か八かの状況に適応させることを提案します。この適応は、(1) 状況と状況に対する人の反応に関する考え方は、挑戦と脅威への対応を最大化するように形成できる、(2) 挑戦と脅威の感情的反応はパフォーマンスの向上に関連する、という考えによって動機づけられています。私たちが特に焦点を当てているのは、感情反応（感情体験とリアルタイムの心血管反応）と明確に定義されたパフォーマンス結果の測定を可能にするコンテキストである e スポーツです。ベースラインのパフォーマンスタスクの後、ゲーマー (N = 250) を、相乗的なマインドセット介入または脳の事実を学習するコントロール条件のいずれかにランダムに割り当てます。 2 週間毎日ゲームを続けた後、プレーヤーは賞金付きのトーナメントに出場します。私たちは試合前の感情的な経験と、試合前および試合中の心血管反応を測定します。対照条件と比較して、相乗的な考え方のゲーマーは、より大きな挑戦感情反応と優れたパフォーマンス結果を示すだろうという仮説を立てています。これらの予測が支持された場合、私たちはこの研究を他の状況にも拡張するよう努めます。

影響、eスポーツ、再評価、挑戦、脅威、考え方、

This is an automatically generated version. The authors and PCI decline all responsibility concerning its content

Otimizando o desempenho dos esportes eletrônicos usando uma intervenção de mentalidade sinérgica

As respostas afetivas durante situações estressantes de alto risco podem desempenhar um papel importante na definição dos resultados de desempenho. Por exemplo, sentir-se instável e nervoso numa entrevista de emprego pode prejudicar o desempenho, enquanto sentir-se entusiasmado e entusiasmado durante uma competição desportiva pode optimizar o desempenho. Estas observações sugerem que a regulação afectiva – a forma como as pessoas influenciam as suas respostas afectivas – pode desempenhar um papel fundamental na determinação de resultados de desempenho de alto risco. Para testar esta hipótese, propomos adaptar uma intervenção de mentalidades sinérgicas recentemente desenvolvida (Yeager et al., 2022) a situações de alto risco. Esta adaptação é motivada pela ideia de que (1) as mentalidades sobre as situações e a resposta de cada um às situações podem ser moldadas para maximizar a resposta ao desafio versus a resposta à ameaça, e (2) as respostas afetivas ao desafio versus a ameaça estarão associadas a melhores resultados de desempenho. Nosso foco particular são os esportes eletrônicos, um contexto que permite a medição da resposta afetiva – experiência afetiva e respostas cardiovasculares em tempo real – e resultados de desempenho bem definidos. Após uma tarefa de desempenho de linha de base, atribuiremos aleatoriamente os jogadores (N = 250) a uma intervenção de mentalidade sinérgica ou a uma condição de controle na qual eles aprenderão fatos cerebrais. Após duas semanas de jogos diários, os jogadores competirão em um torneio com prêmios em dinheiro. Mediremos as experiências afetivas antes das partidas e as respostas cardiovasculares antes e durante as partidas. Em comparação com a condição de controle, levantamos a hipótese de que os jogadores com mentalidade sinérgica apresentarão maiores respostas afetivas ao desafio e resultados de desempenho superiores. Se estas previsões forem apoiadas, procuraremos estender este trabalho a outros contextos.

Afeto, Esports, Reavaliação, Desafio, Ameaça, Mentalidade,

This is an automatically generated version. The authors and PCI decline all responsibility concerning its content

Оптимизация производительности киберспорта с помощью синергетического мышления

Аффективная реакция во время стрессовых ситуаций, в которых ставки высоки, может сыграть важную роль в формировании результатов деятельности. Например, чувство тряски и нервозности на собеседовании может подорвать производительность, тогда как чувство взволнованности и возбуждения во время спортивных соревнований может улучшить производительность. Эти наблюдения показывают, что регулирование аффекта – то, как люди влияют на свои аффективные реакции – может играть ключевую роль в определении высоких результатов в производительности. Чтобы проверить эту гипотезу, мы предлагаем адаптировать недавно разработанное синергетическое вмешательство в мышлении (Yeager et al., 2022) к ситуациям с высокими ставками. Эта адаптация мотивирована идеей о том, что (1) образ мышления как в отношении ситуаций, так и реакции человека на ситуации может быть сформирован так, чтобы максимизировать реагирование на вызов или угрозу, и (2) аффективные реакции на вызов или угрозу будут связаны с улучшением результатов производительности. Наше особое внимание уделяется киберспорту, контексту, который позволяет измерять аффективную реакцию – аффективный опыт и сердечно-сосудистые реакции в реальном времени – и четко определенные результаты производительности. После базового задания на производительность мы случайным образом назначим геймеров (N = 250) либо на синергетическое вмешательство в мышлении, либо на контрольное условие, в котором они будут изучать факты о мозге. После двух недель ежедневных игр игроки сразятся в турнире с денежными призами. Мы будем измерять эмоциональные переживания перед матчами и реакцию сердечно-сосудистой системы до и во время матчей. Мы предполагаем, что по сравнению с контрольным состоянием геймеры с синергическим мышлением будут демонстрировать более сильные аффективные реакции и более высокие результаты в производительности. Если эти прогнозы подтвердятся, мы постараемся распространить эту работу на другие контексты.

Влияние, Киберспорт, Переоценка, Вызов, Угроза, Мышление,

This is an automatically generated version. The authors and PCI decline all responsibility concerning its content

使用协同心态干预优化电子竞技表现

在压力大的高风险情况下的情感反应可以在塑造绩效结果方面发挥重要作用。例如，在工作面试中感到颤抖和紧张可能会影响表现，而在体育比赛中感到兴奋和兴奋可以优化表现。这些观察结果表明，影响调节——人们影响其情感反应的方式——可能在决定高风险绩效结果方面发挥关键作用。为了检验这一假设，我们建议将新开发的协同心态干预措施（Yeager 等人，2022）应用于高风险情况。这种适应的动机是：（1）关于情况和个人对情况的反应的心态可以被塑造以最大化挑战与威胁响应，以及（2）挑战与威胁情感反应将与增强的绩效结果相关联。我们特别关注电子竞技，这是一种可以测量情感反应（情感体验和实时心血管反应）以及明确定义的表现结果的环境。在完成基线表现任务后，我们将随机分配游戏玩家（N = 250）进行协同思维干预或控制条件，在控制条件下他们将学习大脑事实。经过两周的日常游戏后，玩家将参加现金奖励锦标赛。我们将测量赛前的情感体验以及赛前和比赛期间的心血管反应。与控制条件相比，我们假设协同心态游戏玩家将表现出更大的挑战情感反应和卓越的表现结果。如果这些预测得到支持，我们将寻求将这项工作扩展到其他环境。

影响、电子竞技、重新评估、挑战、威胁、心态、

Submission: posted 04 January 2023
Recommendation: posted 25 March 2023, validated 27 March 2023

Cite this recommendation as:
Karhulahti, V.-M. (2023) Synergistic Mindset Intervention for Competitive Situations. Peer Community in Registered Reports, . https://rr.peercommunityin.org/PCIRegisteredReports/articles/rec?id=364

Related stage 2 preprints:

Applying a Synergistic Mindsets Intervention to an Esports Context
Maciej Behnke, Daniël Lakens, Kate Petrova, Patrycja Chwiłkowska, Szymon Jęśko Białek, Maciej Kłoskowski, Wadim Krzyżaniak, Patryk Maciejewski, Lukasz D. Kaczmarek, Kacper Szymański, Jeremy P. Jamieson, James J. Gross
https://doi.org/10.17605/OSF.IO/WSG28

Recommendation

Mindset theories suggest that the mere belief in the malleability of human abilities can already help one to develop related performance. On the other hand, one and the same performance situation can also be experienced in various affective ways, which differently contribute to performance outcomes. Arguably, appraising a performance situation as a “threat” instead of “challenge” is associated with maladaptive responses, such as impaired cardiovascular mobilization. If people could experience performance situations as positive challenges, this might also improve performance outcomes. Drawing from these connected theoretical premises, the synergistic mindset intervention was developed and tentatively found to help adolescents in stressful situations (Yeager et al., 2022).

In the present registered report, Behnke et al. (2023) build on the above and test whether the synergistic mindset intervention can help individuals in competitive gaming situations. The authors utilize one of the leading esport games, Counter-Strike: Global Offensive, and recruit its active players into randomized control and intervention groups for two weeks. Ultimately, the participants compete in a cash-prize tournament involving measures of affective experience and cardiovascular responses. Behnke et al. (2023) hypothesize that the synergistic mindset group will show greater challenge affective responses and superior performance outcomes. As such, the study design has significant potential to generate valuable evidence for various theoretical models and the synergistic mindset model in particular.

The Stage 1 manuscript was evaluated over two rounds by four experts with experimental psychology specializations in mindsets, stress, and statistics. Based on the comprehensive responses to the reviewers' feedback, the recommender judged that the manuscript met the Stage 1 criteria and therefore awarded in-principle acceptance (IPA).

URL to the preregistered Stage 1 protocol: https://osf.io/z3adb

Level of bias control achieved: Level 6. No part of the data or evidence that will be used to answer the research question yet exists and no part will be generated until after IPA.

List of eligible PCI RR-friendly journals:

References

Behnke M., Lakens D., Petrova K., Chwiłkowska P., Kaczmarek L. D., Jamieson J. P., & Gross J. J. (2023) Optimizing Esports Performance Using a Synergistic Mindsets Intervention. In principle acceptance of Version 3 by Peer Community in Registered Reports. https://osf.io/z3adb

Yeager D.S., Bryan C.J., Gross J.J., Murray J., Krettek D., Santos P., ... & Jamieson J.P. (2022) A synergistic mindsets intervention protects adolescents from stress. Nature 607, 512–520. https://doi.org/10.1038/s41586-022-04907-7

Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.

Reviews

Evaluation round #2

DOI or URL of the report: https://osf.io/bfztd

Version of the report: https://osf.io/bfztd

Author's Reply, 24 Mar 2023

Download author's reply Download tracked changes file https://doi.org/10.24072/pci.rr.100364.ar2

Decision by Veli-Matti Karhulahti, posted 21 Mar 2023, validated 21 Mar 2023

Dear Maciej Behnke and co-authors,

Thank you for all careful revisions and responses. I have now received all reviews and the reviewers collectively agree that the work is almost ready for in-principle acceptance. There are a few minor reviewer comments that I encourage you to consider in your final revision. Note that this round we had one more expert who was unable join the review in the first round – this ensures that, at Stage 2, we will have experts who are familiar with the Stage 1 plan even if someone is unavailable next year. I leave a few brief notes of my own.

1. As a follow-up to my earlier comment #3 where I referred to the gaming disorder scale as an exclusion criterion, I think it’s worth giving it a bit more thought. Since participants are paid for playing games and you might learn that some meet gaming-related diagnostic criteria, it could strengthen the study to have a more explicit plan regarding participants in this hypothetical risk group.
2. I suggest a minor revision for the justification of exclusions (p. 13). The age limit is clear, but the other exclusions don’t seem to follow logically: “We will recruit Polish-speaking players as the study will be run in Poland. We will recruit only male players due to their predominance (76%) among first-person shooter gamers.” I believe in both cases the justification is feasibility, along the following lines: “Because including non-Polish and non-male participants would entail producing and testing different sets of group-specific research materials, the study will include only Polish male players” (just an example, feel free to rephrase as you see best or rebut).
3. This comment doesn’t need a response, but I want to leave it at Stage 1 in case it will be discussed at Stage 2. Note that in the design table column “theory that could be shown wrong” you only name the synergistic mindset model. I agree it’s good to be very careful and selective about theoretical inference, but at the same time I am thinking whether the results might be theoretically informative also beyond this single model – after all, the synergistic model stands on other established theory. Preregistered meta-theoretical inference for the upcoming discussion section could be informative for the research program’s development at larger scientific scale.

I am aware that your time is scarce, so I can promise to deliver you the final decision letter in 48h of receiving the next version. This should give you 1 week for revisions and still be able to receive the decision this month.

Sincerely,
Veli-Matti Karhulahti

https://doi.org/10.24072/pci.rr.100364.d2

Reviewed by Ivan Ropovik , 10 Mar 2023

Thanks to the authors for considering my suggestions. As I already expressed in my review of the first submitted version, I think that the proposed study will be informative and may serve as one of the good-practice examples in the field. Therefore, not then, not now do I see any “disqualifying factors”, to use authors’ words. Being pragmatic about research has its merits. Every study has strengths and weaknesses and it is completely fine as long as the writing is clear in that manner and provided that the weaknesses of the design do not disproportionately warp the reflection of the underlying studied phenomena. My two main worries about the reliability of the measurement and the unbalanced demand characteristics of experiment conditions remain, but I also get the other side of the coin, namely that the authors chose to optimize for a greater cumulative potential of the proposed research with respect to the existing evidence in the field. Any one of these two possibly biasing factors (measurement error does not have to be just random noise) or their interaction can lead to false positive results, so laying out these possible weak links in the limitations section seems important to me.

Below, I offer a few follow-up comments/responses on authors’ edits and replies. Regarding the issues that I do not return to, I was either satisfied or okay with the proposed revision/rebuttal. Overall, I think that there are no outstanding issues that should prevent the authors from running the study as proposed and am happy to hand over the final call on any further revisions to the discretion of the editor.

1. “As requested, we have included a critical interpretation of the existing literature in the Introduction. However, we respectfully disagree that psychophysiological challenge/threat or affect regulation research provides “weakly informative designs.” But, we acknowledge that you might have a different opinion on this topic.”

As I said, I have no expertise in literature on reappraisal. Why I made such a bold claim? Some time ago, I was doing an internal review of a protocol of one large multi-site study studying the effect of a similar reappraisal intervention. The lead authors explicitly chose this type of affect regulation intervention because it proved to have the strongest effect in the meta-analysis by Webb et al. (2012). In my view, this is always a poor strategy, for various reasons. This is also the metaanalysis used in the present RR to get an expected effect size. For the sake of the review and protocol revision in that past study, I looked at the included studies in detail and I carried out a re-analysis using arguably more state-of-the-art methods. I also looked at studies from another systematic review of reappraisal interventions by Cohen & Ochsner (2018). What I found and documented was an array of studies having various methodological issues (sizeable proportion of experiments lacking a control group), mostly feeble manipulations on tiny samples yielding huge effects, indications of selective reporting (p-values < .01 completely lacking, e.g., six out of seven available focal tests of the claims for mixed reappraisal had a p-value ~ .04), or study-level data patterns inconsistent with expectations under both, H0 or H1. Since you are dealing with the reappraisal literature, that is why I tried to voice my concerns about the robustness of the given literature and a need for a more critical appraisal of the evidence reported in that literature. Of course, I looked only at a slice of that literature, but that slice did not spur much confidence, to say the least.

If interested, I’m putting this part of my review dealing with the re-analysis of Webb et al. here: https://docs.google.com/document/d/1Q_8134QurWIKdUEzmJRjs7SZN9aJpUiPnZxbTe_3KiA/edit?usp=sharing

Analytic output is available here: https://rpubs.com/ivanropovik/592468
Code with data are available here: https://github.com/iropovik/PSAcovid002review

No need to react to that on your part, I just wanted to back up the bold claim from my review of your RR.

2. Response letter: “The PCI RR community might be up to date with the novel approaches for sample estimation” … expected effect sizes based on the scientific literature are often also effects of interest to scholars in these fields (Lakens, 2022).

Offering an overview of effect sizes for the given effect from past literature is fine. Trying to provide a sample size justification (in a field where it is still rare) is great. So I’m not insisting that it is to be removed. But I cannot help myself but seeing a “power estimation” or “sample estimation” based on previous findings or expectations in general as just bad practice that exacerbates the issue of studies having low power to detect effect sizes that would be considered theoretically relevant. Power is a pre-data concept, just like alpha. Specifically, it is the sensitivity of a given statistical test to reliably detect a range of *hypothetical* population effects of interest. Btw, that is the definition also put forward by Morey & Lakens (2016). This paper lists 4 misconceptions of power and the treatment of power in the present RR is in fact an example of misconceptions #3 (Sample size choice should be based on previous results) and #4 (Sample size choice should be based on what the effect is believed to be). Implicitly also misconception #1 (Experiments have actual/observed/post hoc power). From the paper: “the power of a test depends not on what the effect size is, but rather on all the hypothetical values it could be. … Attempts to “estimate” the power of an experiment, or a field, are based on a mis- understanding of power and are uninformative and confusing (Goodman & Berlin, 1994; Hoenig & Heisey, 2001; O’Keefe, 2007).” (p. 17). There is also still conflation (in the manuscript and response letter) of power analysis with sample size determination (the issue of powering for primary vs secondary hypotheses).

Anyway, if the sample size determination was mainly based on resource constraints (beta = .22 would seem a strange target to many readers), I think it is completely fine (and the most honest way) to say so and just let the reader see the sensitivity of the design across a range of effect sizes (or alternatively, across Ns as you do). IMHO, that would be better than reinforcing misconceptions about power and touring the reader through a determination of SESOI and wide array of past ESs, and arriving at the inability to formally reconciliate the two approaches and setting an arbitrary target of r = .22. That said, all this is inconsequential w.r.t. the informativeness of the present design, so let’s leave it at that.

3. Response letter: “We added the attention check in the questionnaire sets to screen for careless responding. Now item #59 states:“ “Please select "Strongly disagree" for this item to show that you are paying attention.”

Just reiterating from my review exclusion of careless responders should only be applied using pre-treatment measures, as carelessness itself may have been affected by the treatment, and exclusions based on that would induce bias into your model. You may also consider the methods for detecting careless responding patterns in the “careless” R package. One attention check is better than nothing, but there are more advanced methods.

4. Response letter: “Our software (Microsoft Forms) does not allow us to randomize the order of the items within the scales. … As some studies suggest, the order in which items are presented or listed is not associated with any significant negative consequences (Schell et al., 2013) and does not cause differences in average scores (Weinberg et al., 2018).”

There may or may not be order effects, as it is a highly idiosyncratic effect tied to the actual content of what is being measured. Within-block randomization is the way to play it safe. You can always use say three different forms. For instance, just three forms have been found to perform relativelly well compared to the full randomization in planned missingness designs, if the latter is not possible (e.g., paper-pencil data collection). The downside is a bit more complicated administration and joining of the data. I leave it upon authors’ discretion if they think it is worth it.

5. “If we find biologically impossible values, we will delete them. We will report the number of outliers for a given variable.”

Not necessary. I’d rather try to be very conservative with the exclusion of outliers (or specific values) just as you propose and only remove very improbable or impossible values. What is more important than, e.g., counting the number of outliers, is to run the entire analysis twice, with the MAD > 3 outliers in, out, and thus checking if the decision to remove outliers or not has any material impact on any of the main substantive findings. Reporting even that in a short paragraph in the results would be nice, IMO.

6. “The support for the hypotheses will be provided if the models fit the data well, i.e., RMSEA < .06; SRMR < .0 08, CFI > .95, χ2 > .05 (Bentler, 1990)”

It is the p-value of chi^2 that should be > .05. Your model will fit very well if the actual chi^2 value would approximate the degrees of freedom. Having a threshold for the chi^2 value would be uncommon. There is also a typo in SRMR threshold.

7. “If the fit indices suggest model misfit, we will not be interpreting effect sizes.”

A df = 37 model does not offer a hell of a lot dimensions of data space along which the model could be rejected, but from my experience, there is a pretty decent chance that the chi^2 will point to significant global misfit between the model and the data. This is a really risky little note in an RR :D

Anyway, this is a terribly strict requirement. I’m definitely not saying to disregard evidence against the exact or even approximate fit. Chi^2 test is the only formal test of a model and the best guard against misspecified models. A significant chi^2 only tells you that there may be a misspecification in the model and that you need to take a closer look. Taking a deeper look is essential in such case. I think it is reasonable to plan the following:

If the exact model-data fit hypothesis will be rejected, a set of careful diagnostic procedures to identify the possible local sources of causal misfit will be carried out (examining the matrix of residuals and modification indices). The fit would therefore be regarded as adequate if either (1) the exact fit test (chi^2 test) did not signal significant discrepancies between the data and the model or (2) if there was no larger pattern of substantial residuals (say > .1) indicating systematic local misfit.

With some reservations, a disconfirmed model can still be useful and its estimates can still have interpretational value provided that the fit of the model is not very bad – CFI, TLI way below .9, chi^2 being higher multiples of the model df. You also need a contingency plan for model modification, if this is the case. Half data-driven, half theory-driven careful modifications are less of an evil than interpreting a badly fitting model, where the serious misspecifications propagate through the entire model (or throwing the data away).

8. “We did not find strong enough evidence on whether these factors moderate the effects of synergistic mindset intervention on cardiovascular and performance outcomes (Yeager et al., 2022) to include them in the primary model.”

Effectively, only examining effect that proved to be significant in past research goes against the nature of scientific inquiry. If you are not interested to testing a moderation hypothesis, that is completely fine, but I think it should be said directly – that you chose not to test moderation within your primary model. Period. The current justification is a bit awkward.

9. Response letter: “However, we would like to keep the option of using the overall negative and positive affective experience scores by averaging the four negative affective experiences in the exploratory analysis. Although it might not be a pure robustness check for our conclusions, in this way, we will be able to observe the difference between the most popular operationalizations of affective experience and statistically superior options.

Such contingency is completely fine and even desirable! What I was objecting only is to qualify the robustness of the results by using a psychometrically inferior model – an unweighted sum score.

10. Response letter: “If we cannot use multiverse analysis, we will run multiple models and report the results in supplementary materials. After eliminating the different operationalizations of affective experience, we counted 72 possible models (3 options for affective experience x 8 options for cardiovascular measures x 3 options for game measures). This analysis aims to describe the range of effect estimates based on all reasonable data analytical decisions.”

Completely agree about the fact that planning things like multiverse analyses is not a concern at this stage.

That said, what you are describing is in fact a form of multiverse analysis! You don’t need any expert on that, IMO. It’s fine just to run all these possible models representing different design options and report what distribution of effect sizes for a few selected focal estimates did you find. Sure, having a script running through all combinations and spitting out a nice multiverse visualization is a great feature but you can do it easily later on “by hand” too. E.g., estimating the effect size and SE (or CIs) for each model, ordering by effect size and easily plotting them using a forest plot (see a very convenient forest() function in the metafor package). The reader would then easily see the distribution of effect sizes, what proportion their CIs cross zero, etc. Or, alternatively, even only briefly describing the distribution of effect sizes verbally would be far better than nothing.

Thanks again for the opportunity to discuss the design of this important study with you. Good luck with the study.

Best wishes,
Ivan Ropovik

https://doi.org/10.24072/pci.rr.100364.rev21

Reviewed by Lee Moore, 13 Mar 2023

The authors have done a good job of revising the registered report based on my feedback and suggestions (although it is worth them reading my published work more closely when describing how they will score the demand and resource evaluation data - i.e., substract evaluated demands from resources to get a score ranging from -5 to +5). While I could quibble with one or two of the authors responses, overall, the registered report is excellent and describes what will be a highly rigorous and superb piece of research in a comprehensive, accurate, and replicable way. It has been an interesting process reviewing this registered report and so thanks for the opportunity to be involved. I wish the authors all the best with the data collection and analysis phase and I look forward to seeing the final write-up in due course.

https://doi.org/10.24072/pci.rr.100364.rev22

Reviewed by Jacob Keech, 21 Mar 2023

As I have joined the review process after a very extensive and well-articulated first round of revisions, I have very few additional comments. I have read the manuscript, materials, and response to round 1 reviews thoroughly. My overall assessment is that this is a well-designed study which has been thoroughly described in the revised Stage 1 Report. The authors have also thoughtfully responded to the comments in the first round of reviews. The application of the synergistic mindset intervention to optimizing esports performance is an innovative idea and I am sure the results of the study will have substantive theoretical and practical value.

One minor point is that I note the concerns raised by Reviewer 2 about the control condition. I agree that the authors’ decision to retain the original procedure for the control condition is reasonable. This will allow comparability, and it has been tested using a large number of participants in the prior studies testing the synergistic mindset intervention. However, on page 12 of the revised manuscript, the control condition is still referred to as a “placebo control”. I recommend dropping the placebo wording as the Yeager et al. (2022) paper did not describe the control conditions as a placebo control, and as has been discussed in prior reviewer comments, it doesn’t appear that there are matched expectancies across conditions.

I wish the authors all the best with conducting their study, and I look forward to reading the about the results in the full paper.

https://doi.org/10.24072/pci.rr.100364.rev23

Reviewed by Ivana Piterová, 10 Mar 2023

The authors have incorporated my comment into the code, so I have no further comments on the code at this stage.

BR

https://doi.org/10.24072/pci.rr.100364.rev24

Evaluation round #1

DOI or URL of the report: https://osf.io/9zmp8

Version of the report: https://osf.io/9zmp8

Author's Reply, 07 Mar 2023

Download author's reply Download tracked changes file https://doi.org/10.24072/pci.rr.100364.ar1

Decision by Veli-Matti Karhulahti, posted 12 Feb 2023, validated 12 Feb 2023

Dear Authors,

Thank you for submitting a highly rigorous Stage 1 proposal to PCI RR and giving us the opportunity to assess it. I have now received three reviews, two with highly detailed feedback regarding various aspects of the study and one specifically verifying computational reproducibility. We had one unfortunate reviewer cancellation in the process, which delayed decision, but I believe the present reviews were worth the wait and provide highly useful comments that help making final improvements to your plan.

We fully agree with the reviewers that it’s important to have this study carried out, so please utilise the rich feedback in a way that is most useful for your purposes (considering practical limits). I add a few comments as well; again, take what is valuable and skip the rest.

1. As the reviewers poin out, it would be good have explicit inclusion/exclusion criteria. It is mentioned on p. 11 that inclusion requires 6h/week of CSGO, but how is this measured (baseline item #5?) and is this the only criterion? Does this mean veterans cannot participate if they don’t train/play anymore? What about age, language?
2. Related to the above, I am thinking whether it would be better to recruit participants based on rank or another performance indicator, rather than hours of training/play, considering performance is a hypothesis. It is possible that rank also partially explains affective experiences, so having min/max rank could help. Additionally, it feels important to control previous experience with the bot deathmatch used in the intervention; individuals who have already learned how to produce high scores in this mod could generate unwanted data (see another comment later).
3. Still about participants: it is mentioned that individuals with significant health problems will be excluded. Does this refer to standard cut-offs with the applied scales? I’m specifically looking at gaming disorder, which you measure with a DSM-5 scale; although I’m sceptical about the scale’s clinical validity, it would be a concern to recruit (or continue having) participants who meet gaming disorder criteria at screening. Consider moving this scale to baseline registration measures as an exclusion screener.
4. Some existing Polish scale translations are cited. It would be good report whether you will use your own translations (when you do) or if English versions are used (if they are). The supplementary materials are informative, but it would help to have this information clearly in the manuscript.
5. Health scale from Ware Jr & Sherbourne (1992) is reported to ask physical health (supplement p. 4), but the item of the original scale measures general health. Is the modification “psychical” part of the Polish translation?
6. I don’t want to further complicate scale selection (reviewers already address that in detail), but it also seems theoretically plausible for the synergetic mindset intervention to work specifically for individuals with low self-esteem (I believe this was part of Dwek’s original reasoning). I wonder if e.g. Rosenberg’s Self-Esteem Scale for exploratory or future analyses might be informative.
7. On p. 13 it is noted that SESOI would be d = .07 among .03 and .05, but I don’t see it explained why the former and not one of the latter. This has no pragmatic relevance, but you may wish to elaborate unless I've missed something.
8. Deathmatch AI is used (p. 21). For future work (no sense to make last minute changes here), the Aim Botz mod could provide performance data in a more standardizable setting. Although deathmatch is more organic, scores are influenced by weapon choice, bot behavior, etc. That said, I wonder if it would be possible to further standardize deathmatch conditions, e.g., by fixing participants to one weapon.
9. Related to the above, it might be good to stress that the performance situation is human-AI and not human-human. This may be a highly relevant component in the production of competitive challenge/threat response. I don’t know if the following study ever replicated but see e.g. Kätsyri et al. 2013: https://doi.org/10.1093/cercor/bhs259 (perhaps to be considered more at Stage 2 discussion).
10. Related to the above still, I’m also thinking to what degree gender affects this intervention. Especially in CSGO, competitive women players have always been a small minority, which has affected in how they experience competitive situations (e.g., Balakina et al. 2022: https://doi.org/10.1145/3569219.3569393 ). I believe the support for stereotype threat effects is currently weak at best, but I would consider e.g., including images of and quotes from top women players in the materials for women participants (i.e. have separate sets of materials for men and women). This could be a simple way to improve intervention effectiveness.
11. I ran a face validity check with the materials through a Polish player, and one additional note (see also reviewer feedback) came up: the term “gamer” (graczy) addresses a specific subgroup of players, which has a strong identity connotation in this cultural context and tends to exclude some potential participants (in the same way as “scientists” would likely exclude e.g., philosophers among researchers). It’s totally ok if this is your target group, but just ensuring you’re aware of that (as it would be very easy to use different terminology).
12. Regarding the baseline measures on p. 29 and supplement p. 6, I note the following. #5: considering using “playing” instead of “training”, as people interpret “training” in many ways, e.g., ranked play isn’t training? (especially if this item is used as an inclusion crieterion) #6: Some people have multiple accounts and have played other mods of (almost identical) CS, so maybe provide an option to self-estimate total hours played CS?
13. P. 36 data availability says that all data will be made available, but I assume e.g., all video data will not be made available? Any other exceptions?

I was informed during the process that your reservation for the lab is now in April. Let’s make sure you have a decision before that. I know it's a lot of feedback. I can be contacted directly at any point for all concerns and questions, and if possible, please inform me some days before when you’re about to resubmit so that I can prearrange time to fully prioritize this and provide a rapid turnaround. This is an important study and it’s a privilege to help you with it.

Best wishes,
Veli-Matti Karhulahti

https://doi.org/10.24072/pci.rr.100364.d1

Reviewed by Lee Moore, 23 Jan 2023

Download the review https://doi.org/10.24072/pci.rr.100364.rev11

Reviewed by Ivan Ropovik , 12 Feb 2023

Thanks to the authors for the opportunity to read their manuscript (ms). Overall, I think that the proposed study would be informative. One of its main selling points is that it will produce a rich dataset, making it possible to examine the effect of the designed intervention on the longitudinal trend in performance and affective measures, while not relying entirely on self-reports but also collecting physiological measurements. Thanks to being a RR, the present study has the potential to bring evidence that would likely be much more robust than a modal study published in this field. That said, I have also some critical takes and suggestions for improvement.

An acknowledgment upfront, I am not a social psychologist and don't have much expert knowledge about the substantive aspects targeted by the present study. In my review, I will mainly focus on the measurement, design, and analysis side of things. As my role as a reviewer is mainly to provide critical feedback, I provide it in a form of comments below, not in order by importance but rather chronologically, as I read the paper. I leave it to the authors’ discretion which suggestions they find sensible and choose to incorporate. I hope that the authors find at least some of the suggestions below helpful.

1. In the introduction, I am missing a bit more critical interpretational viewpoint. As usual, in most research studies, the presented past research is all taken at face value. Especially in such literature, where weakly informative designs yielding very heterogeneous findings are rather the norm, I think it makes sense to identify which of the past studies presented in the intro are vitally important for informing the theoretical underpinnings of the present study and qualify the strength of their conclusions by the methodological robustness of the design they utilized.
2. The data will not support very wide generalization, so I would suggest revising the title to sth more specific like Optimizing *Esports* Performance Using a Synergistic Mindsets Intervention. The same with abstract.
2. My hunch is that affective response patterns may be rather stable characteristics that will be difficult to structurally alter with a self-administered one-shot type of intervention. Getting *a* significant effect is not that hard. Finding *the* effect using a rigorous method (even though the intervention seems face valid) is far less likely, IMO. The good thing about the design of the present study is that it attempts to partly stretch the intervention over a week. Kudos to the authors that they choose a RR format to give it a try.
3. “The other participants will be assigned to a validated placebo intervention focused on learning about the brain (Yeager et al., 2022).” Well, this appears to be a stretch too far for me. The cited study did not carry out *any* validation of this control condition. Being used before (even though in Nature paper) is not equal to being validated. I’d suggest removing that remark. I’ll have more comments on the control condition below.
4. Again, rather loose use of a validity claim. “In sum, our study will provide a unique combination of high internal and external validity levels”. I think get what you mean (using a controlled experiment & real-world outcome?), but I am not sure about the “high” part (more on that later – I see issues with the measurement and comparativeness of the controls). Anyway, instead of this slightly hyped-up language, I’d stick to being more descriptive.
5. The targeted population could have been described in more detail. It does not even mention that the sample will be entirely Polish (true?). Fully ok, but these details need to be acknowledged. How are the participants going to be sampled/recruited? Predominantly, what kind of people will they be? Students? General population? Is >6 hours of playing time the only inclusion criterion? Is there an expectation regarding the sample composition?
6. Part Sampling Plan/Expected Effect Sizes, the first paragraph could be structured more clearly. It currently mixes substantive assertions with generic stats meta-talk.
7. I think that the Expected Effect Sizes part is a conceptually relatively weakly informative part of the research design justification. I understand that the authors wanted to provide a solid ground for the sample size determination but I think it misses the point. To elaborate, I see the following. I like the determination of SESOI that is grounded in some reality. That part is fine. But I don’t get the need for “expected ES”. Trying to inform the design of a study based on (probably) non-systematic picking among relatively idiosyncratic, heterogeneous effects from published (thus subject to publication bias) literature is unfruitful, IMO. The meta-analysis by Webb et al. (2012) is, IMO, not helpful too, for that matter (more on that later). Even if there was no bias in the literature, considering “expected ES” is incompatible with the frequentist notion of power, a *pre-data*, theoretical concept (just like α), i.e., a sensitivity of a given statistical test to reliably detect a range of hypothetical population effects of interest (see Morey & Lakens, 2017). Instead of such long, numbers- and references-ladden part, I, as the reader, would prefer to see the power curve (given the specific model) for the hypothetical range of effect sizes. That range would, of course, also include the SESOI. A figure would say more than thousand words, not forcing the reader to appraise the informativeness of the present design at fixed points.
8. The unfruitfulness of the SESOI & Expected ES combo can IMO be seen in the Sample Size Determination part. Absent any formal mechanism (or common conceptual footing) to reconcilliate the two, the authors are pushed to conclude that the SESOI is unfeasible, while the “surprisingly large” ES from the past literature also did not pass some of the internal checks present in a skeptical reasoner. So the outcome of the several paragraphs long justification is an arbitrary set of ESs. There’s inherently nothing wrong with setting an arbitrary target, or one that is doable given some budget. My point is only that looking at a wider hypothetical range would be more informative. That way, the reader would gain a comprehensive outlook of what power does the given design/test provide for any given ES. Btw, I liked the justification for the target ratio of type I/II error rates.
9. It is fine to compute power for individual SEM parameter estimates (not “for the structural equation model” as put down), but I think it always makes sense to report whether the SEModel has decent power to pick up significant model-data discrepancies if these are present. That can be done for an approximate fit hypothesis using the RMSEA (see https://www.quantpsy.org/rmsea/rmsea.htm) and be reported at least in SMs.
10. Re: assuming factor loadings of .50… This is a serious design blunder, IMO. If the employed scales have such an abysmal overrepresentation of error variance (loading of .50 implies 75% of the total variance being error), it has serious consequences for the efficiency, precision, and likely also the accuracy of the design. Yes, in most of the social science research, measurement properties of the measures are hidden away behind convenient sum scores, so I don’t want to scold the authors for paying higher-than-usual attention to some of their auxiliaries. But still, if it is the case, this should be discussed.
11. Just an idea always worth considering, IMO. Maybe it would make sense to try to screen out careless responders (e.g., based on longstring detection or some sort of insufficient variance in responding pattern, or being a multivariate outlier indicating random response pattern). If done, it should only be applied to pre-treatment measures, as carelessness itself may have been affected by the treatment, and exclusions based on that would induce bias.
12. For the description of stages, I found it hard to understand at times, what follows what. E.g., "participants will provide informed consent and fill out baseline questionnaires"... "Next, the researcher will apply sensors to obtain cardiovascular measurements, and participants will fill in the baseline questionnaires”. What comes first? Cardiovascular measurements or questionnaires? The figure with the procedure workflow is clear but the text description was sometimes difficult to follow. Maybe it’s because I’m not acquainted with the subject matter, but it was difficult for me to keep track of what was measured, when, and for what purpose.
13. In general, there is little information on how the measures will be ordered. Within the questionnaire block, why not randomize to minimize the order effects? The same with items within scales. Will their order be randomized?
14. In stage I, is gaming for 2 minutes enough to provide a reliable picture? Just asking.
15. It is fine to have confirmatory RQs, with all other things lumped in exploratory analyses. But the reasoning on p.24 does not make sense to me. This one: “We treat them as secondary because we did not include them in the power analysis, and we may not have enough statistical power to infer about the effects of synergistic mindset intervention on them.”. Why? Power seems to be an independent issue to me.
16. Number of items for RESS-EMA scales not reported. Also, why a different response scale in the baseline and stage III?
17. Several scales are based on only 2 items. I guess also the alpha will be very low.
18. Re data reduction contingencies: I’d say that with factor loading below .40, the modeled latent is at higher risk of being “hijacked” by some idiosyncratic factor (or poor construct validity in general) but that is also the case with factor/component scores or sum scores. The latter two only hide that problem. So although far from ideal, I think it would be reasonable to model the latents as long as the model converges and fall back to some observed scores only if that is not the case.
19. Regarding the use of difference scores, I think it is statistically superior to use residual scores. E.g., regress pre-match baseline score on resting baseline and taking the residual.
20. Re missing data: the plan is to exclude participants with missings on a per-analysis basis. If you use SEM, that makes little sense. Why not use full-information maximum likelihood to impute the missing values? Mplus can do that easily. Deletion means discarding information and is only ok when the data are missing completely at random.
21. Any conceptual or statistical reason to exclude outliers with a such number of observations and type of variables?
22. After controlling for the effect of the intervention, your model implicitly assumes that the covariances between the residuals of mediators are all zero. Is that what you want? If untrue, this represents a model misspecification that will propagate throughout the model and bias the other estimates.
23. RMSEA and CFI are okay approximate fit indices, but why is the plan for model fit evaluation missing the only formal model test, the chi^2 test? I’d definitely want to see that. Maybe also the SRMR.
24. Is it correct that you won’t be interpreting effect sizes if there will be a significant model-data misfit? Btw, ironically, low loadings help lower the chi^2 value.
25. “We do not plan to use the same approach for hypothesis 2 because we did not find a way to operationalize the smallest effect of interest for cardiovascular responses”. Alternatively, it is fairly easy to compute Bayes factors for individual model parameters by a model selection approach, using just the BIC (Bayesian information criterion) approximation – comparing the BIC of models with and without the given parameter (see Wagenmakers, 2007). No SESOI or prior needs to be specified (a weakly informative unit information prior is implicitly assumed) and BIC can easily be extracted for any model. Presenting the continuous BFs alongside equivalence tests may be informative for the readers.
26. “We will test the robustness of our findings by adding to the primary model the moderation of the negative prior mindsets, negative appraisals, and gaming experience.” Robustness? How? The target causal effect is identified regardless. Btw, I think it would make things clearer if you framed your research goals as either testing causal effects (intervention –> mediators, outcome) or examining mechanisms through which the causal effects operate (mediation effects).
27. Exploratory Analyses section: “We treat the moderations as exploratory analysis because the initial studies were inconclusive on whether the prior mindsets moderate the effects of synergistic mindset intervention on cardiovascular and performance outcomes (Yeager et al., 2022).” I see your point but the fact that "initial studies were inconclusive" is irrelevant with respect to the inclusion of a moderator to the model. The thing is, that the inclusion of a moderator and modeling (incoming and outgoing) paths to the treatment and outcome nodes is an act of expressing ignorance about the presence of the given paths. Meaning, there may or may not be an effect. It's fine to choose your confirmatory research aims but justifying it based on the inconclusiveness of prior research appears conceptually weak to me.
28. Also Exploratory Analyses section, “We will also test the robustness of our findings by testing alternative operationalizations of the variables used in the model. For positive/negative affect, we will use the sum of the positive/negative items instead of the latent factor.” The sum score is only a special case of a latent variable model, where you assume equivalence of factor loadings and reliabilities of all measured indicators equal to 1. Therefore, from a measurement perspective, I personally wouldn't qualify the interpretation of the robustness of the conclusions based on employing psychometrically inferior measurement models. Instead, I would only plan using observed sum scores as a fallback plan if they were locally under-identified (say in case of collinearity issues) or producing estimation issues due to the violation of local independence assumption (large residual covariances). That is far from ideal as sum scores in such case hide serious measurement issues but at least provides you with an opportunity to empirically address your target research questions, albeit more tentatively. Btw, if you really were to resort to observed scores as a fallback, I'd use a PCA component score, where you don't assume equal component loadings at least.
29. Third, “For positive/negative affect, we will also try the single difference score (sum of negative emotions subtracted from the sum of positive emotions)”. Do you mean the difference in *mean* scores if they are using the same scale? If there will be some missing data, subtracting sums will not work.
30. I am a fan of testing the robustness of findings by employing alternative operationalizations. But with so many, how are you going to do that specifically? There will be quite a few combinations. Maybe you should consider doing a multiverse analysis for these robustness checks.
31. It is practically not feasible for me to review the measures as only links for items are provided.
31. The last one and important. When I read the control condition instruction, it seems obvious to me that this condition likely doesn’t elicit the same degree of expectancy as the reappraisal manipulation. The control condition needs to seem smart and face valid to the participants, but be inert w.r.t. the outcomes. I encourage the authors to think about how to make the control condition far more believable and applicable because at the moment, any post-treatment difference in the outcome between the experiment groups may be due to a likely substantial difference in the strength of demand characteristics perceived by the participants. Apart from that, it does not help if only “the synergistic mindsets group will report the adherence and progress in scheduled affect regulation training” (got that right?).

Just a few examples. Maybe it's just me but that sounds trivial to me and I definitely wouldn't expect any effect on my esport performance or affect by reading about Phineas Gage and suchlike stuff:

"What I didn't expect, however, was that the brain is so involved when I'm playing! In fact, everything I need to play - seeing the map, hearing my teammates on headphones and speaking to them, moving around the map with the mouse and keyboard, thinking about my next move, anticipating my teammates' and opponent's moves - are all made possible by different areas of my brain!"

“I now know that my eyes are not responsible for my vision and that I know how to go to the local store is due to my temporal lobe. I learned that when I am stressed, my behavior depends on the cooperation of two parts of the nervous system - one responsible for normal functioning and the other responsible for immediate reactions."

"On the other hand, it is sad how much brain damage can impair further functioning. Fortunately, thanks to such injuries, scientists are learning more and more about how the brain works and how to treat various diseases."

To sum up, I think the present proposal has merits and can provide rich data and relatively robust findings. What worries me most is (1) the outlook of poor measurement and (2) pretty obvious differences in demand characteristics of treatment and control conditions. I also feel the study underutilizes the data, where the authors are depriving themselves of the opportunity to examine interesting questions (specifically using the longitudinal measurements, modeling more complex models, and looking at the follow-up). But I am a fan of the principle that authors should be free to study what they want. Anyway, this got way too long (sorry for that) so please, feel free to integrate what you see fit and react only to what requires a reaction.

Good luck with the revision!
Best wishes,

Ivan Ropovik

https://doi.org/10.24072/pci.rr.100364.rev12

Reviewed by Ivana Piterová, 24 Jan 2023

Thank you for the opportunity to review the Mplus scripts for this RR.

Power analysis script: In this script, the semicolon at the end of the line (NAMES =...) is missing, so the code firstly reports an error, but after this correction, the code works well and produces the reported results, so I can confirm the reproducibility of the calculations with this script in Mplus, listed in the supplements.

Primary analysis script: The mediation code is correct and on simulated data produces the expected results without errors or warnings.

https://doi.org/10.24072/pci.rr.100364.rev13

User comments

No user comments yet

or Register
Submit a report