Recommendation

Is the SNARC effect modulated by absolute number magnitude?

Recommendation by Robert McIntosh, based on reviews by Melinda Mende and 1 anonymous reviewer
A recommendation of:

One and only SNARC? A Registered Report on the SNARC Effect’s Range Dependency

Submission: posted 30 November 2022
Recommendation: posted 28 November 2023, validated 28 November 2023
Cite this recommendation as:
McIntosh, R. (2023) Is the SNARC effect modulated by absolute number magnitude? Peer Community in Registered Reports. https://rr.peercommunityin.org/articles/rec?id=352

Recommendation

The Spatial-Numerical Association of Response Codes (SNARC) effect refers to the finding that smaller numbers are responded to faster with the left hand and larger numbers with the right hand (Dehaene et al., 1993). This robust effect implies that numbers are associated with space, being represented on a mental number line that runs from left to right. The SNARC effect is held to depend on relative number magnitude, with the mental number line adjusting dynamically to the numerical range used in a given context. This characterisation rests on significant effects of relative number magnitude alongside non-significant effects of absolute number magnitude. However, a failure to reject the null hypothesis, within the standard frequentist statistical framework, is not firm evidence for the absence of an effect. In this Stage 1 Registered Report, Roth and colleagues (2023) propose two experiments adapted from the original methods of Dehaene et al. (1993), with a Bayesian statistical approach to confirm, or rule out, a small effect (d = 0.15) of absolute number magnitude in modulating the classic SNARC effect.
 
The study plan was refined across two rounds of review, with input from two external reviewers and the recommender, after which it was judged to satisfy the Stage 1 criteria for in-principle acceptance (IPA).
 
URL to the preregistered Stage 1 protocol: https://osf.io/ae2c8
 
Level of bias control achieved: Level 6. No part of the data or evidence that will be used to answer the research question yet exists and no part will be generated until after IPA.
 
List of eligible PCI RR-friendly journals:
 
 
References
 
Dehaene, S., Bossini, S., & Giraux, P. (1993). The mental representation of parity and number magnitude. Journal of Experimental Psychology: General, 122(3), 371–396. https://doi.org/10.1037/0096-3445.122.3.371
 
Roth, L., Caffier, J., Reips, U.-D., Nuerk, H.-C., & Cipora, K. (2023). One and only SNARC? A Registered Report on the SNARC Effect’s Range Dependency. In principle acceptance of Version 3 by Peer Community in Registered Reports. https://osf.io/ae2c8
Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.

Evaluation round #2

DOI or URL of the report: https://osf.io/rfsah

Version of the report: v2

Author's Reply, 22 Nov 2023

Decision by Robert McIntosh, posted 08 Sep 2023, validated 08 Sep 2023

Thank you for your careful work addressing review comments for this Stage 1 RR. The revisions have been evaluated by one of the original reviewers, who is happy with the changes. (The other original reviewer was not available at this time.)

I have looked over the paper myself, and think it is considerably improved, but there are a few more issues that I would like you to attend to before IPA is issued for this experiment. You are not obliged to follow any suggestions made, but you should provide a rationale for whatever course of action you decide to take.

1) "Power analysis". In developing you study plan, you present both frequentist power analyses (for prior studies) and power analyses developed within the Bayesian framework for your planned study. Given that you seem to be applying a criterion threshold BF (>3) to support a binary claim, it may be legitimate to talk about 'power', but I think it is nonetheless potentially confusing for you to use the same language of 'power' to apply both to frequentist and Bayesian appoaches.

One key reason is that power concerns only the probability of detecting true effects (of a given size) when they are present. But your Bayesian analysis is not only asking this question; it is also configured to return evidence in favour of the null hypothesis (BF < 1/3). Thus, you state in the abstract that "... a power of .90 for detecting moderate evidence (Bayes Factor above 3 or below 1/3)", but your sample size planning actually seems to be predicated only on sensitivity to a true effect when present, without considering your sensitivity to the null when the null is true. A small tweak of your code would allow you to make more complete statements about the probability of your Bayesian analysis returning sensitive evidence for H1 or H0, and about the rates of misleading evidence (using the language of Bayes Factor Design Analysis).
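
For illustration only, a design analysis along these lines might look like the simulation sketch below. This is not your analysis code: it assumes a one-sample t-test on standardised per-participant effects, the default JZS Bayes factor as implemented in the pingouin package, and placeholder values for the sample size and the true effect size.

# Minimal Bayes Factor Design Analysis sketch (illustrative only, not the
# authors' planned analysis). Assumes a one-sample t-test on standardised
# per-participant effects and pingouin's default JZS Bayes factor.
import numpy as np
from scipy import stats
import pingouin as pg

rng = np.random.default_rng(2023)

def bfda_rates(d_true, n, n_sims=2000, bound=3.0):
    """Simulate P(BF10 > bound), P(BF10 < 1/bound), and P(insensitive)."""
    counts = {"evidence_h1": 0, "evidence_h0": 0, "insensitive": 0}
    for _ in range(n_sims):
        x = rng.normal(loc=d_true, scale=1.0, size=n)  # simulated effects
        t = stats.ttest_1samp(x, 0.0).statistic
        bf10 = float(pg.bayesfactor_ttest(t, nx=n))    # default JZS BF10
        if bf10 > bound:
            counts["evidence_h1"] += 1
        elif bf10 < 1.0 / bound:
            counts["evidence_h0"] += 1
        else:
            counts["insensitive"] += 1
    return {k: v / n_sims for k, v in counts.items()}

n = 600  # placeholder sample size, not the planned n-max
print("H1 true (d = 0.15):", bfda_rates(0.15, n))  # sensitivity to a true effect
print("H0 true (d = 0.00):", bfda_rates(0.00, n))  # sensitivity to the null

Run under both a true effect of the smallest size of interest and a true null, a simulation of this kind yields the rates of sensitive evidence for H1, sensitive evidence for H0, and insensitive outcomes, which is what a complete Bayes Factor Design Analysis statement requires.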

2) At present, your sample size is predicated on the smallest effect size of interest for H3, and the tests of H1 and H2 inherit their sensitivity from this design. Strictly speaking, this means that your experimental design has not been shown to be adequate to test H1 and H2, whereas the RR format requires that you demonstrate your required standard of evidence for all hypotheses. Given that your SESOI for H3 is so small, it would seem a small extra step to make the (easy) argument that an effect size smaller than this would be similarly uninteresting for H1 and H2, which would then allow you to assert the required level of sensitivity for all hypotheses.

3) However, for your experiment to really have the required level of sensitivity for all hypotheses, then your stopping rule cannot be based on a sensitive outcome for one hypothesis only - you could only stop the experiment (prior to n-max) if a sensitive BF were found for all three hypotheses. If your plan is to terminate the experiment based on H3 only, then your plan does not have the desired level of sensitivity for H1 and H2, and you would need to relegate these hypotheses to secondary, exploratory status (i.e. remove them from the Stage 1 plan).
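
To make the logic concrete, a stopping check of the kind I have in mind would look something like the sketch below (the Bayes factor values, thresholds and sample sizes are placeholders, not part of your plan):

# Sketch of a sequential stopping rule that allows early termination only if
# all three hypotheses have reached a sensitive Bayes factor (BF > 3 or < 1/3).
# Names and values are illustrative, not the authors' actual code.
def all_sensitive(bfs, bound=3.0):
    return all(bf > bound or bf < 1.0 / bound for bf in bfs)

def should_stop(bfs, n_current, n_max):
    # Stop at n-max regardless; before that, require sensitivity on H1, H2
    # and H3 together, not on H3 alone.
    return n_current >= n_max or all_sensitive(bfs)

# Example: H3 is sensitive (BF = 5.4) but H1 and H2 are not, so sampling continues.
print(should_stop([2.1, 0.9, 5.4], n_current=400, n_max=900))  # -> False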

4) You have now added the Odd Effect as a positive control/manipulation check. However, your logical chain here simply states that it is a robust effect in parity judgements, and that you expect to find it and will be surprised if you find evidence against it (you do not state what will happen if the BF is insensitive). This does not constitute a meaningful manipulation check, because it does not seem to have any implications for your main hypothesis tests. Normally, a manipulation check is an effect that should definitely be present in the data so that, if it is not found, there is evidence that your task has not worked as intended. Normally, when a manipulation check is failed, the conclusion is that the experiment is deemed incapable of returning a clear answer on the experimental hypotheses. For this reason, the manipulation check is normally first in the list of inferential tests, to establish the adequacy of the task to the question. Like other inferential tests, it requires a power/sensitivity analysis. If the Odd Effect has this status, then you need to make this clear. If it does not, then it is not a manipulation check.

5) On the other hand, your H1 is a check that the SNARC effect is present in all number ranges. This seems much more to me like a conventional (and relevant) manipulation check, and yet you simply state that you will be surprised if you don't find it in all ranges, but do not indicate that this would limit your ability to test further experimental hypotheses in any way. I would at least have thought that finding the SNARC effect in a given range was a requirement for testing the other hypotheses with respect to that range (i.e. any tests in which that range is involved for H2 or H3). If this is not the case, then it would seem that your experiment has been configured such that you could be making claims about the range dependency of the SNARC effect even if your data showed no evidence of SNARC effects per se. I realise that this outcome is unlikely, but it is the logical coherence of your analysis plan that is at stake.

6) The manuscript is long and complex. There are no word limits at PCI-RR, but with the aim of future publication, I would strongly encourage you to try to be more concise wherever possible (obviously, without omitting any essential material).

7) The new title is rather unwieldy: "One and only SNARC? How Flexible are The Flexibility of Spatial-Numerical Associations? A Registered Report on the SNARC’s Range Dependency".

Should the first part be: "One and only one SNARC?"? The second and third parts seem to be two alternate sub-titles; there should probably be one and only one sub-title.

I hope that these comments are helpful. You have done a lot of good work to sharpen this plan up, but I think that a little further sharpening and streamlining is required before IPA.

Best wishes,

Rob McIntosh

Reviewed by anonymous reviewer 1, 27 Aug 2023

The authors have carefully revised their report. I highly appreciate their efforts and I am fully satisfied with how they addressed my concerns and incorporated my suggestions. 


Evaluation round #1

DOI or URL of the report: https://osf.io/heq6p

Version of the report: v1

Author's Reply, 25 Jul 2023

Decision by Robert McIntosh, posted 04 Apr 2023, validated 04 Apr 2023

Thank you for your patience, and I apologise that it has taken so long to return reviews for your manuscript. It proved tricky to find reviewers, but once suitable and willing people were found, the reviews were thorough, and I think you will find them very helpful in guiding a revision of this Stage 1 plan. Reviewer#2 in particular is very familiar with the logic and rigours of Registered Reports, and provides excellent guidance on related considerations.

Both reviewers are generally positive about the proposed study, but have substantive concerns and suggestions for improvement. You should consider (and respond to) all of these points carefully. I would emphasise the following in particular, adding some comments of my own.

Reviewer#1 (Melinda Mende) makes a number of requests for clarity, and emphasises the need for a full justification of the approach to data trimming. The approach to the treatment of Reaction Times is absolutely critical (since it is the basis for the core dependent measure). It is described in detail, but it is not rationalised. The treatment of RT is a complex issue, and decisions about whether (and how) to exclude outliers and/or to transform data and/or to use robust estimators of central tendency per cell (e.g. medians) are complex and ideally should be informed by a good working knowledge of the characteristics of the data in your experiment. (In general, of course, pre-registered approaches may tend towards more conservative and robust choices, because the final form of the data cannot be known in advance.)

On this point, although this Stage 1 RR seems to have been thought through carefully, I do not see direct evidence that the tasks have been fully piloted, where pilot data would allow the proposed analysis pipeline to be tested in full. It may be that you have done such piloting, or perhaps you have used the same data collection approach/platform in a previous study, so that your analysis pipeline is well established. If so, then you should describe this relevant history in the present RR. If not, then I honestly think it is necessary to conduct a reasonably-sized pilot in order to debug and optimise your analysis plan.

This relates also to the points made by Reviewer#2 regarding your quality checks (and confidence in the measurement accuracy of the online platform); positive controls establishing that the task is capable of testing the hypotheses of interest seem essential here.

I also agree strongly with this reviewer that the purpose and inferential role of all parts of the analysis must be clear (e.g. how will the outcome of the dropout analysis be used to inform interpretation of findings), and that the exploratory analyses should be removed from the Stage 1 plan. The ‘follow-up’ analyses should be elevated to full inferential status and specified fully or, if not essential to the main conclusions, relegated to exploratory status and omitted (the latter approach may be preferred as simpler, given that your analysis plan is already rather complex).

I also concur with the idea that combining frequentist and Bayesian approaches seems unnecessarily complex and ambiguous. If these approaches do not lead to the same outcomes then which approach will you be guided by? (And then why should you bother to include the other approach at all?) It is of course possible to include parallel frequentist and Bayesian analyses in an RR, but specifying unambiguously which theoretical conclusions will follow for the full range of possible outcomes becomes very complex.

With regard to the frequentist analysis, I have some concerns about the approach to alpha levels and (non-)adjustment for multiple comparisons. In the text you state: “For each test described below, a significance level of α = .01 will be used. The reason for using a rather conservative significance level is that we will conduct multiple tests per hypothesis… Importantly, the significance level does not need to be corrected for the total number of conducted tests in this study, because the tests belong to different test families and because different theoretical inferences can be drawn from their results (Lakens, 2016). Moreover, we will look at each result individually and not generalize from one single significant result within a test family to the presence of an effect in both experiments and in all possible number ranges, so that our interpretations will not inflate the familywise error rate.”

This sounds very thorough, but I am not sure it is sound/coherent. First you state that you adopt a conservative significance level so that you don’t have to adjust for multiple comparisons – it would be more transparent to state what the significance criterion is, and how it has been adjusted for (how many) comparisons. Without this, it is unclear what your effective significance threshold is. In apparent contradiction to the above you then go on to state that the threshold does not need to be adjusted because the individual tests are all testing independent hypotheses, and you will interpret each individually. This logic is repeated in the design table.

Although this approach may seem appealing, I am not sure that it is convincing in the present case. As far as I am aware, there is no theory proposing that functionally independent SNARC effects may exist for your different number ranges. In any case, you also state that “not finding it [the SNARC effect] in one of the four ranges despite our large sample would speak against the robustness of the SNARC effect”. This means that the results are not really being evaluated independently for each number range, but are considered together to bear on the same theoretical question. Moreover, it is not convincing to state that a failure to find the result in one of the four ranges would speak against the robustness of the SNARC effect, because 90% power implies a 10% chance of a false negative in any one range (and thus a 1 - 0.9^4 ≈ 34% chance of at least one false negative across the four ranges).

In general, I think that your statistical approach needs better justification and specification, and that it might benefit from simplification (e.g. by deciding on either a frequentist or Bayesian approach). In passing, I note that you refer to another paper (Roth, Lukács, et al., 2022) for your power calculations (which, confusingly, seems to be an earlier version of this same RR plan). In any case, power calculations are an integral part of a Stage 1 RR plan and so should be described fully in the RR itself.

I hope that the reviewers' helpful comments will be useful to you in taking this project forward, and if you decide to revise this Stage 1 RR, then you should indicate how you have responded to each of the comments made, including the additional ones above.

Reviewed by Melinda Mende, 04 Feb 2023

The article is well-written and targets an interesting and relevant issue. The researchers aim to investigate the relative magnitude (RM) dependency and absolute magnitude (AM) dependency of the SNARC effect. I think that this work is a positive example of a well-designed study in which a lot of considerations were made, starting with the optimal number of trials per cell and the power calculations. Further, the planned data analysis is well described, with useful measures to improve the quality of the statistical analysis. Overall, I have just some minor suggestions to further improve this work.

 

 

p.7

“In that study, the observed result pattern looked like Scenario 5 in Figure 1”.

Figure 1 is too far away from this claim. I suggest either introducing the figure earlier or not referring to it at this stage.

 

p.7

The content of footnote 1 would be better placed in the main text, together with the previous explanation of how to calculate the SNARC effect.

 

p. 10/11

The scenarios are very hard to understand, even though they are illustrated in Figure 1; one needs to scroll up and down a lot. Maybe you could divide the figure into parts, explain each scenario, and then directly show the corresponding part of the figure.

 

p. 12

“namely 0 to 5 and 4 to 9 in Experiment 1, and 1 to 5 (excluding 3) and 4 to 8 (excluding 6) in Experiment 2”

Which study are you referring to?

 

p. 16

I do understand your design approach and I think that the two experiments are well elaborated. Nonetheless, I do not understand the content of Table 2. What do you mean by “Parity +0.5”/”Parity -0.5”?

 

p.17 

Why not visualize the time course of the stimulus presentation with a figure?

 

p. 18

“This figure shows the four between-subjects conditions”

Why don’t you want to use a fully within-subject design?

 

p. 18

“handedness, and finger-counting habits”

How will these be measured?

 

p. 18

“Participants may choose response keys for the experimental task which are to be located in the same row and about one hand width apart from each other on their keyboard” 

Even if the keys are one hand width apart from each other, how do you make sure that participants do not use just one hand for giving their responses?

 

p. 19

“Only trials with RTs between 200 and 1500 ms will be included in the analysis. Further outliers will be removed in an iterative trimming procedure for each participant separately, such that only RTs that are maximum 3 SDs above or below the individual mean RT of all remaining trials will be considered. Finally, only datasets of participants with at least 75% valid remaining trials and without any empty experimental cell (number magnitude per response side) in both number ranges will be considered.” 

Please specify how and why you selected these criteria. Such data trimming criteria are often similar in the literature but not identical, so I would like to learn about your justification for using these particular criteria.

 

 

 


Reviewed by anonymous reviewer 1, 02 Apr 2023

The authors of the present Stage 1 Registered Report aim to investigate the flexibility of spatial-numerical associations by means of two experiments, one a close replication and one a conceptual replication of previous studies in the field. I think the topic is highly timely, given the accumulating evidence on SNAs and their implications. Overall, the authors have obviously taken great care in reviewing the existing literature and in assessing the current methodological limitations. However, the implementation of this study as a Registered Report is still suboptimal; my main concerns are outlined below.

  1. The existing section “How could absolute magnitude affect…” left me wondering whether it is all really necessary, or whether this part could be shortened to focus on Table 1, which seems to be the element that readers can most easily link to the rest of the manuscript. 
  2. Statistical power. The authors opted for d = 0.2 as the minimal effect size of interest and explained how estimating this effect from the existing literature could be biased by various factors. As a reference point, it would nevertheless be useful to report the typical effect size in this literature. What was the original effect size in the studies they aim to replicate? Also, when reporting the a priori calculation, the authors refer to a specific standard deviation but do not report any value - please add it. 
  3. Participants. The authors report only a minimum age (18) as a requirement. Since the experiments measure reaction times, for which age differences might exist, wouldn’t it be more sensible to also set a maximum age? What is the aim of giving not just full but also partial compensation? 
  4. Procedure. The experiment will be implemented on the Wextor online platform. Since the expected effects are very small, do the authors have any information regarding the measurement accuracy of this tool (e.g., compared with lab-based experiments)?
  5. Quality check. The authors report a seriousness check that will be used prior to the beginning of the procedure, and a self-assessment to be filled in right afterwards (e.g., participants will rate the conditions in which the experiment took place). However, they do not report any concrete quality check to assess correct implementation of the procedure and participants’ compliance with the instructions. In line with this, the authors do not appear to have implemented any positive control. These aspects need to be carefully addressed in any registered report, especially for online procedures. 
  6. Demographic questions. What’s the rationale behind allowing the “I prefer not to answer” option? It seems rather essential to collect complete answers from all participants. Also, why explicitly use the term “finger counting habit” in these questions? This might seem rather obscure to the participants.
  7. Response keys. The phrasing here is rather obscure to me. Why allow the participants to use any other key than the two that were assigned by default? Especially if no check is put in place, e.g. how will they check that the distance between the two keys is optimal? 
  8. Dropout rates. The authors plan to further investigate the reasons for dropouts, but it’s unclear how the results of this analysis will affect subsequent analyses (e.g., in case they show significantly different dropout rates in some conditions?)
  9. Statistical approach. The authors aim to combine null-hypothesis testing with the estimation of Bayes factors. I assume the former was chosen because of earlier studies; however, since the main analyses will employ t-tests, why not opt directly for a fully Bayesian approach? Combining the two approaches is always rather complex to manage in a registered report, especially when outlining the interpretations that follow from different outcomes. A plus of opting for a Bayesian approach is that it would allow the authors to use sequential analyses for a more efficient recruitment and sampling plan (e.g., Schönbrodt, F. D., & Wagenmakers, E.-J. (2018). Bayes factor design analysis: Planning for compelling evidence. Psychonomic Bulletin & Review, 25, 128-142).
  10. Analyses. The authors report three types of analyses: main, follow-up, and exploratory. However, by definition, a Stage 1 submission that will later become a pre-registration cannot include exploratory analyses. While they could be referred to generically in the analysis plan, they cannot be outlined in detail and included in the design planner - otherwise they would be pre-registered as well. I am more uncertain regarding the follow-up analyses, which have a more nuanced status - I invite the authors to reconsider whether these analyses should be pre-registered or not. 

 
