Recommendation

How well do "non-WEIRD" participants in multi-lab studies represent their local population?

Yuki Yamada based on reviews by Zoltan Dienes, Patrick Forscher and Kai Hiraishi

A recommendation of:

STAGE 1

The WEIRD problem in a “non-WEIRD” context: A meta-research on the representativeness of human subjects in Chinese psychological research

YUE Lei, ZUO Xi-Nian, HU Chuan-Peng https://osf.io/dkmt3/ version 7

Abstract

 Psychological science aims at understanding human mind and behavior, but it primarily relies on subjects from Western, Educated, Industrialized, Rich, and Democratic regions, i.e., the WEIRD problem. This lack of diversity and representativeness of subjects compromised the generalizability of psychological science. To address this issue, large-scale international collaborative projects were initiated, and more data are collected from non-WEIRD regions. However, it is unknown whether subjects from “non-WEIRD” regions can represent their local population. In this meta-research, we plan to survey the characteristics of Chinese subjects reported in empirical studies published in five mainstream Chinese psychological journals and in large-scale international collaborations. The results will provide a realistic picture of Chinese participants in psychology, and we will discuss potential solutions to the issue of representativeness in both China and worldwide.

meta-research, population psychology, representativeness, WEIRD, generalizability

Submission: posted 07 September 2021
Recommendation: posted 03 April 2023, validated 07 April 2023

Cite this recommendation as:
Yamada, Y. (2023) How well do "non-WEIRD" participants in multi-lab studies represent their local population? Peer Community in Registered Reports. https://rr.peercommunityin.org/articles/rec?id=103

Recommendation

In this protocol, Yue et al. (2023) aim to clarify whether the samples from non-WEIRD countries included in multi-lab studies are actually representative of those countries and cultures. Focusing on China, this study will compare Chinese samples in several multi-lab studies with participants in studies published in leading national Chinese journals on various characteristics, including demographic data and geographic information. This work will provide useful information on how well multi-lab studies achieve generalizability, which matters all the more because such studies are intended to address the generalizability problem.

The Stage 1 manuscript was reviewed by three experts, including two with an interest in the WEIRD problem and a wealth of experience in open science and multi-lab research, plus an expert in Bayesian statistics, which this manuscript uses. Following multiple rounds of peer review, and based on detailed responses to the reviewers' comments, the recommender judged that the manuscript met the Stage 1 criteria and therefore awarded in-principle acceptance (IPA).

URL to the preregistered Stage 1 protocol: https://osf.io/ehw54

Level of bias control achieved: Level 4. At least some of the data/evidence that will be used to answer the research question already exists AND is accessible in principle to the authors (e.g. residing in a public database or with a colleague) BUT the authors certify that they have not yet accessed any part of that data/evidence.

List of eligible PCI RR-friendly journals:

References

Yue, L., Zuo, X.-N., & Hu, C.-P. (2023) The WEIRD problem in a “non-WEIRD” context: A meta-research on the representativeness of human subjects in Chinese psychological research, in principle acceptance of Version 7 by Peer Community in Registered Reports. https://osf.io/ehw54

Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.

Reviews

Reviewed by Zoltan Dienes, 03 Apr 2023

I am happy with the authors' reply. I can't seem to access the paper; I am not sure if it is a problem at my end. But so long as the information in the reply appears in the paper, then my point has been addressed.

https://doi.org/10.24072/pci.rr.100103.rev61

Evaluation round #5

DOI or URL of the report: https://osf.io/dkmt3/

Version of the report: MD5:6e54b5c53f06135960169a7b3a117a12

Author's Reply, 29 Mar 2023

Download author's reply https://doi.org/10.24072/pci.rr.100103.ar5

Decision by Yuki Yamada, posted 23 Mar 2023, validated 23 Mar 2023

Thank you very much for your effort to further improve the manuscript.
The remaining issues now seem to centre on the age analysis. The reviewer was, very thankfully, extremely prompt in providing comments on the manuscript. I regret having to ask you to revise the manuscript several times, but please revise it again so that the proper analysis can be conducted.

Yuki Yamada, Recommender

https://doi.org/10.24072/pci.rr.100103.d5

Reviewed by Zoltan Dienes, 23 Mar 2023

The authors show they actually can pick up reasonable proportion differences using counts (which is what should be used for a multinomial). But for the age case, they have misunderstood me: I in no way meant they should use a t-test. I meant they should indicate what sort of effects their procedure can pick up, using the same logic as they have used, e.g., here: "an article reported 30 participants, with age = 23.3 ± 3.5, we estimate the approximate number of participants under 20 is 5 (r code: `round((pnorm(20, mean = 23.3, sd =3.5) * 30))`), the number of participants aged between 21 ~ 30 is 24 (r code: `round((pnorm(30, mean = 23.3, sd =3.5) * 30)) - 5`), and participant aged between 30 ~ 40 will be 1." Sure, a multinomial analysis can pick up more than just mean differences in age. But one still needs to show the model of H1 as a uniform has sensible properties, given such a model does not reflect a relevant theory. So one way of showing the sensitivity of the analysis is to imagine a difference only in means of a certain amount, translate that into the age bins, run the multinomial analysis, and see what it can pick up. At the moment all we know, from what is provided in the actual paper, is that it can get evidence for H1 when the sample data are generated from a uniform; but that is such an unlikely position for the real data to be in that it does not tell us much. Sorry to go on, but my point has not been addressed.
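
A minimal sketch of the binning logic quoted above, written in Python rather than R and offered only as an illustration, not as the authors' code. The function name `expected_bin_counts`, the cutoffs, the treatment of the last bin as open-ended, and the 3-year shift in mean age are all assumptions made for the example.

```python
from statistics import NormalDist


def expected_bin_counts(mean, sd, n, cutoffs):
    """Approximate counts per age bin for n participants, assuming ages ~ Normal(mean, sd).

    cutoffs are the upper edges of every bin except the last, which is open-ended.
    """
    dist = NormalDist(mu=mean, sigma=sd)
    # Cumulative head counts at each cutoff, rounded as in the quoted R snippet.
    cumulative = [round(dist.cdf(c) * n) for c in cutoffs]
    counts = []
    previous = 0
    for c in cumulative:
        counts.append(c - previous)
        previous = c
    counts.append(n - previous)  # everyone above the last cutoff
    return counts


# The quoted example: 30 participants, age = 23.3 +/- 3.5, bins split at 20 and 30.
reference = expected_bin_counts(23.3, 3.5, 30, cutoffs=[20, 30])

# Hypothetical comparison sample: same SD, mean age 3 years higher (assumed shift).
shifted = expected_bin_counts(26.3, 3.5, 30, cutoffs=[20, 30])
```

Bin counts obtained this way for a reference sample and for a sample with a shifted mean could then be passed to the multinomial analysis to see which shifts in mean age yield evidence for H1.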

https://doi.org/10.24072/pci.rr.100103.rev51

Evaluation round #4

DOI or URL of the report: https://osf.io/dkmt3/

Version of the report: MD5:17e9d57ee757b4d34f0d74f00e6700b7

Author's Reply, 23 Mar 2023

Download author's reply Download tracked changes file https://doi.org/10.24072/pci.rr.100103.ar4

Decision by Yuki Yamada, posted 20 Feb 2023, validated 20 Feb 2023

The reviewer has commented again. I think the discussion is getting quite mature, but the in-text explanation and justification of the priors are still insufficient. In a normal peer review, I would prefer not to go back and forth too many times because it consumes both parties' resources, but I think it is important that the authors and the reviewer agree on the key points, and I would like to continue for another round. I hope you will find those points of agreement, taking into account this reviewer's past comments.

Yuki Yamada, Recommender

https://doi.org/10.24072/pci.rr.100103.d4

Reviewed by Zoltan Dienes, 18 Feb 2023

The authors have made a major step in addressing my point. (Incidentally I didn't find the supplementary materials, so I don't comment on them.) I think, however, the discussion of this point should be in the main text when they introduce the prior (model of H1) that they use, and they need to elaborate more to justify the scientific relevance of this model of H1 given these results. In effect, if their reference sample had 50% males, their comparison would need about 67% or more males (or 33% or less) to be detected as different. Is this reasonable in this context? (The model of H1 presumes any difference in proportion is as probable as any other, and one consequence of that vague assumption is that a large difference is needed to detect any difference.) I am not sure I find that reasonable. (That is why I always use scientifically informed priors.) Can the authors argue for a reasonable prior or argue that this prior is reasonable *given their particular scientific/empirical context*?

For the age result, the average reader will not be able to interpret what this means in terms of what age differences could be detected. That is why I suggested modeling age as a normal and showing what difference in mean ages could be detected. Find a way to present the sensitivity of the method to age differences that is intuitively graspable.

https://doi.org/10.24072/pci.rr.100103.rev41

Evaluation round #3

DOI or URL of the report: https://osf.io/dkmt3/

Version of the report: MD5:4fa349f4ca2ede8882c7b09e38cbe4f2

Author's Reply, 11 Feb 2023

Download author's reply Download tracked changes file https://doi.org/10.24072/pci.rr.100103.ar3

Decision by Yuki Yamada, posted 21 Jan 2023, validated 21 Jan 2023

Thank you for your thorough revisions. As you can see, many issues have been resolved and several of the reviewers are satisfied.
Nevertheless, there remain issues related to the analysis that require further consideration. That is, the setting of the prior distribution and the inferences that can be drawn from it. Please read the reviewer's comment carefully for more details. I agree that this issue should be resolved before granting IPA. I look forward to seeing the revised manuscript again.

Yuki Yamada

https://doi.org/10.24072/pci.rr.100103.d3

Reviewed by Zoltan Dienes, 20 Jan 2023

The planned analyses are now more clearly presented. But a main issue I raised has not been dealt with. The authors are using uniform priors. Such default priors do not typically reflect what a plausible theory would predict; which means no plausible theory is typically being tested. The way to answer the concern is to show that in this situation the default uniform is reasonable. How can one show that? One way is to indicate how the BF performs on imaginary data showing plausible ways H1 may be true. For example, give a sex ratio that deviates from the H0 by a relatively small amount - what is the smallest amount for which the BF still gives good evidence for H1? Age is more complex as it is multi-df; but one may proceed as the authors have by assuming age is normally distributed and find the smallest difference in means for which one just gets evidence for H1. (If one assumes normality one could argue one should use Bayesian t-tests for age. But one could also treat normality as just one possibility for checking how the test behaves.) This approach would be the simplest. More thorough would be showing what population effect would lead to, say, an 80% chance of being detected. Conversely, one should show that if there is no effect, there is sufficient N to obtain evidence for H0. This is of course unlikely to be a problem with the planned sample size, but one should always show this in a registered report (the logic of planning for a severe test is given here: https://psyarxiv.com/yc7s5/).
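
One way to make the sex-ratio check concrete is sketched below in Python. It is a simplified binomial stand-in for the authors' analysis, not their actual test: H0 fixes the proportion at a reference value, H1 places a uniform prior on it, and the search returns the smallest observed proportion above the reference for which the Bayes factor reaches a threshold. The function names, the sample size of 100, and the threshold of 3 are illustrative assumptions.

```python
from math import ceil, exp, lgamma, log


def log_bf10_binomial(k, n, theta0=0.5):
    """Log Bayes factor for H1: theta ~ Uniform(0, 1) against H0: theta = theta0."""
    # Under a uniform prior every count 0..n is equally likely, so P(k | H1) = 1 / (n + 1).
    log_marginal_h1 = -log(n + 1)
    # Binomial log-likelihood of k successes in n trials under H0.
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    log_likelihood_h0 = log_choose + k * log(theta0) + (n - k) * log(1 - theta0)
    return log_marginal_h1 - log_likelihood_h0


def smallest_detectable_proportion(n, theta0=0.5, threshold=3.0):
    """Smallest observed proportion at or above theta0 with BF10 >= threshold, else None."""
    for k in range(ceil(n * theta0), n + 1):
        if log_bf10_binomial(k, n, theta0) >= log(threshold):
            return k / n
    return None


# Hypothetical inputs: 100 participants, reference proportion of males 0.5.
detectable = smallest_detectable_proportion(n=100, theta0=0.5, threshold=3.0)

# Converse check: data exactly at the reference value should favour H0 (BF10 < 1).
bf_at_null = exp(log_bf10_binomial(50, 100))
```

Repeating the search over the planned sample sizes would show how large a deviation is needed before the uniform model of H1 yields evidence, which is the property the justification of the prior has to address.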

The authors responded to my point by saying they will check robustness with a different prior. This suggestion in itself leaves open inferential flexibility: What conclusions will be reached if the different priors lead to different conclusions? So the authors need to be clear on what basis they will draw particular conclusions. Simplest would be to justify one prior as most suitable (e.g. because of the way it behaves as described above), indicate all conclusions will be wrt this prior, and the other is for background information only.

https://doi.org/10.24072/pci.rr.100103.rev31

Reviewed by Kai Hiraishi, 13 Jan 2023

Download the review https://doi.org/10.24072/pci.rr.100103.rev32

Reviewed by Patrick Forscher, 04 Jan 2023

I have read the revised manuscript and the authors' response to reviewers. I was largely satisfied with the previous draft of the manuscript and I'm also satisfied with this one. I'd like to see this manuscript accepted so that I can read the authors' results. :)

This is a great project -- one I'll certainly be keeping an eye on!

Patrick S. Forscher

Associate Director

Busara Center for Behavioral Economics

https://doi.org/10.24072/pci.rr.100103.rev33

Evaluation round #2

DOI or URL of the report: https://osf.io/dkmt3/

Version of the report: MD5:248b05d98f61aaf0c70a236c20e6df8c

Author's Reply, 03 Jan 2023

Download author's reply Download tracked changes file https://doi.org/10.24072/pci.rr.100103.ar2

Decision by Yuki Yamada, posted 06 Oct 2022

Both of the previous two reviewers responded favorably to the revisions made by the authors, and I agree with them. Thank you very much for making substantial revisions and resolving many of the issues. I would then again like to invite the authors to submit a revised manuscript.

In this round, we added a third reviewer who focuses exclusively on the Bayesian statistics. Considering all the peer review comments, the authors should provide a more specific explanation and justification for the hypotheses they are setting here.

Also, the individual peer review comments make some points about the target population. Since this is also involved in the hypothesis setting, I still think further justification is needed.

As the third reviewer pointed out, there is still some ambiguity in the design table. Please make it more specific so that the hypothesis, analysis, and interpretation correspond in a straight line.

This study has the potential to be interesting and important, and I would like to encourage the revision of it.

Yuki Yamada, Recommender

https://doi.org/10.24072/pci.rr.100103.d2

Reviewed by Kai Hiraishi, 02 Oct 2022

Download the review https://doi.org/10.24072/pci.rr.100103.rev21

Reviewed by Patrick Forscher, 26 Sep 2022

I think the authors have done a thorough and admirable job addressing my comments. I only have two remaining (potential) concerns.

First, I still wonder whether we should expect research samples to exactly represent the general population from which they were drawn. To take an extreme example, let's imagine that a research community goes through a period of doing lots of research on anxiety. One would expect that research field to include lots of highly anxious participants -- more so than one would expect in the general population. But this lack of representativeness is intended and, I think, justified because it is necessary to achieve the researchers' goals. If this research community stays fixated on anxiety for an extended period of time, one could critique that community for being too focused on one topic at the expense of other valuable topics that are relevant to non-anxious people, but I think periods of focus on one topic can be justified.

To be fair to the authors, they have included some codes of the researchers' intended generalization -- but I think the findings will need to be interpreted carefully with the relationship between samples, populations, and research goals in mind. So, I don't see a strong need for revision -- this is just something to keep in mind for the discussion section (with, perhaps, a few tweaks to the framing of the paper's goals).

Second, I do have some lingering concerns about the analysis plan. One part of this concern is linked to my comments about whether one expects exact representativeness in research samples, as this expectation will be encoded in the prior. I just wonder whether the Bayes factors are comparing the right models. Maybe they are, as long as the discussion section contextualizes the results appropriately (ie it makes clear that sampling decisions are or should be a function of research goals) -- so perhaps no action is needed on this point. My other concern about the analysis plan is that it may need to be critiqued by someone with more Bayesian expertise than either I or the other reviewer can provide. I think this is an issue for the editor to decide.

At any rate, I think this is an interesting and valuable project and I'm looking forward to seeing where the authors go with it.

I sign all my reviews,

Patrick S. Forscher

Research Lead, Busara Center for Behavioral Economics

patrick.forscher@busaracenter.org

https://doi.org/10.24072/pci.rr.100103.rev22

Reviewed by Zoltan Dienes, 14 Sep 2022

I will comment just on the choice of analysis.

p 10: "and the Ha is not H0" and also footnote 2 "for others, the Bayesian hypothesis testing can be done without specifying the alternative hypothesis"

In fact, a Bayes factor always requires a specification of H1 because one has to calculate the probability of the data given H1, and this can only be done if H1 is some particular distribution. Where it seems not to be done, e.g. in the Hoijtink reference, it is done implicitly; and in the current case of a default, there is an explicit distribution, it is just that it is chosen without reference to the specific scientific problem. In this case, the authors themselves claim there is a distribution for H1, so the statements cited above should be deleted. However, I think the model of H1 used as a default by the authors is not exactly equal fixed probabilities in each cell, as might be read from their description. Rather, it is that the distribution of probabilities in each cell is the same. What the authors need to do is say what this distribution is, and briefly justify its relevance.

Incidentally, to see that there is a distribution involved, try the "Bayesian multinomial test" in JASP, which I think is what the authors are using: if one specifies the same expected counts as the counts for the prior (model of H1), the Bayes factor is not 1. That is because the prior/model of H1 uses a distribution of probabilities in each cell.
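
The point that H1 is a distribution over cell probabilities, not a single set of proportions, can be illustrated with a short Python sketch. It assumes a Dirichlet prior on the cell probabilities under H1 and a point null at the expected proportions; whether this matches the exact computation in JASP is an assumption. The function name and the example counts are hypothetical.

```python
from math import exp, lgamma, log


def log_bf10_multinomial(counts, expected_props, prior_counts=None):
    """Log Bayes factor for a Dirichlet model of H1 against a point null.

    counts: observed cell counts.
    expected_props: cell probabilities fixed under H0 (all assumed > 0).
    prior_counts: Dirichlet parameters under H1; defaults to all ones (uniform).
    """
    n = sum(counts)
    if prior_counts is None:
        prior_counts = [1.0] * len(counts)
    a_total = sum(prior_counts)
    # Dirichlet-multinomial marginal likelihood under H1. The multinomial
    # coefficient is omitted because it cancels against the same term under H0.
    log_m1 = lgamma(a_total) - lgamma(n + a_total)
    log_m1 += sum(lgamma(x + a) - lgamma(a) for x, a in zip(counts, prior_counts))
    # Likelihood under H0 with the cell probabilities fixed at expected_props.
    log_m0 = sum(x * log(p) for x, p in zip(counts, expected_props))
    return log_m1 - log_m0


# Hypothetical data that match the expected proportions exactly.
counts = [50, 30, 20]
expected = [0.5, 0.3, 0.2]

bf_uniform_prior = exp(log_bf10_multinomial(counts, expected))
# Prior counts set equal to the expected counts, as in the JASP check described above.
bf_matched_prior = exp(log_bf10_multinomial(counts, expected, prior_counts=[50, 30, 20]))
```

In both calls the observed counts match the expected proportions exactly, yet the Bayes factor differs from 1 and favours H0, because the likelihood averaged over the prior is lower than the likelihood at the expected proportions.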

In terms of justifying their model, the authors can show what Bayes factors are obtained for different deviations from expected proportions. This will indicate what size deviations their analysis is sensitive to, given their Ns. They should do this in order to show the severity of their tests: Is it likely that the tests will find evidence against their hypotheses, given reasonable assumptions about what size deviations there might be?

The Design Table needs to be more specific. List each hypothesis that will be tested, giving the exact test, and stating under what conditions the hypothesis will be deemed supported or refuted (e.g. what BF threshold).

https://doi.org/10.24072/pci.rr.100103.rev23

Evaluation round #1

DOI or URL of the report: https://osf.io/dkmt3/

Author's Reply, 12 Sep 2022

see attachments

https://doi.org/10.24072/pci.rr.100103.ar1

Decision by Yuki Yamada, posted 28 Jul 2022

Thank you very much for granting PCI-RR the opportunity to peer-review your paper.
At the same time, I sincerely apologize for the long delay in responding to you.

This manuscript was reviewed by two very experienced researchers who are interested in the WEIRD issue. Frankly speaking, the purpose of this study is favorably viewed by the reviewers as well as by myself, and it is desirable that this study be properly carried out.

This will require careful elaboration by a major revision, especially with respect to sample representativeness, coding, and sampling methods, as the reviewers have pointed out.

One reviewer also raised concerns about the placement of hypotheses and the setting of prior distributions in the Bayes factor analysis. It would be good to have this point clarified, but if necessary, please let us know and we can ask an expert in Bayesian statistics to check this point as an additional reviewer. In that case, we will ask them to focus the scope of their review on this point, so we do not expect it to take as long as it has so far.

I am looking forward to your revised manuscript.

Yuki Yamada

https://doi.org/10.24072/pci.rr.100103.d1

Reviewed by Kai Hiraishi, 04 May 2022

Download the review https://doi.org/10.24072/pci.rr.100103.rev11

Reviewed by Patrick Forscher, 27 Jul 2022

The authors propose to assess the representativeness of participants in Chinese psychology research. To this end, they propose to compare samples from five different sources:

1. Samples from five mainstream Chinese journals
2. Chinese samples from large-scale international collaborations
3. Non-Chinese samples from large-scale international collaborations
4. The National Bureau of Statistics of China
5. The Chinese Family Panel Study

They pursue their goal of assessing the representativeness of participants in research in China with five activities, which use the samples illustrated in 1-5:

Compare samples from Chinese journals (1) to Chinese samples in large-scale collaborations (2);
Compare samples from Chinese journals (1) to census data (4 and 5);
Compare Chinese samples from international collaborations (2) to non-Chinese samples in international collaborations (3)

I love the concept of this project. Psychology has long paid too little attention to sampling. Most of the time this problem gets described under the umbrella of the “WEIRD problem”, but it can be construed more broadly as a “generalizability problem”. The problem can even be construed more deeply as an issue with who defines the samples and topics that are interesting to study and how we draw conclusions about those samples and topics. I think this project could advance our understanding of these kinds of problems.

I don’t have many specific problems with the proposed study – instead, I have some suggestions for the authors to consider. Some of these suggestions may broaden the scope of the research. I think it would be fine if the authors declined some of these – so consider these suggestions as possibilities that the editor and authors can think about together as the protocol is revised.

My comments are divided into four sections:

Broad aims
Coded characteristics
Data sources
A note on the analysis plan

Broad aims

Although I love the topic of the project, I could imagine a skeptic wondering whether anyone would expect Chinese samples to perfectly represent the Chinese population. Researchers are supposed to choose the sampling methods that allow them to accomplish their research goals. Sometimes this involves random sampling to accomplish representativeness, but sometimes it doesn’t – as is the case when researchers simply want to show that a psychological phenomenon exists in any population at all. This defense is actually the very one offered by Mook (1983) in response to early claims that psychology has a generalizability problem (https://www.vanderbilt.edu/psychological_sciences/graduate/programs/quantitative-methods/quantitative-content/mook_1983.pdf).

However, there are some powerful responses to Mook’s argument:

Although some psychology research involves existence proofs, many research topics require going beyond existence proofs. This is especially the case with research that has applied aspirations – it doesn’t really matter if you can get something to work in the lab if it doesn’t work in the real world.
If researchers focus too much on the experiences and concerns of a narrow sub-population, they will miss phenomena that are experienced outside of that sub-population (see https://osf.io/preprints/africarxiv/xd269/). I like to think of these missed phenomena as “unknown unknowns” – research psychologists can’t even know that they are missing them because their measures and datasets don’t include the necessary information to know this.
Researchers choose research priorities based on their own experiences. If researchers are also drawn from a narrow sub-population, they will choose research priorities that are important to that sub-population, creating a distorted view of human psychology (see https://www.proquest.com/docview/527775905?pq-origsite=gscholar&fromopenview=true).

I think the authors should consider Mook’s arguments and the responses to them. Doing so might inform the aims and design of this study, as well as the information that is coded from each data source (see my next point, below).

Coded characteristics

The characteristics that are coded should be selected to accomplish the project’s broad aims. Because I think these broad aims might need a bit of adjustment, and because the specific adjustments should be decided by the authors, I won’t be too prescriptive with my suggestions about what to code. However, I do have a few thoughts that will, hopefully, help the authors think through what sorts of characteristics to select.

If the authors want to show that research in China is too focused on a specific research aim, such as the existence proof, they might consider coding some characteristics that capture the match between aim and sampling method. This might include, for example, the type of sampling each study used (convenience, online panel, probability, etc.), the type of research (exploratory or confirmatory), and/or the setting (lab, field, online, etc.). They might find some ideas of what to code in this article on Arabic social psychology by Saab and colleagues (2020; https://journals.sagepub.com/doi/abs/10.1177/1948550620925224).

If the authors want to capture the types of topics that researchers select (and maybe compare the topics in Chinese-language journals to those in big team science initiatives), it might be worth coding something about the broad topic of study. Ideally, this would use a pre-existing coding system (such as the article keywords) to lower the burden on the coders. This was a focus in a commentary I co-wrote on African psychology (https://osf.io/preprints/africarxiv/xd269/); we didn’t do a systematic coding of topics, but instead tried to give a holistic sense of how African priorities might differ from Western priorities.

If the authors want to assess who’s setting the research priorities, they might want to code where the lead authors of each article are from and/or what their background is. This is an approach taken in Thalmayer and colleagues (2021; https://serval.unil.ch/resource/serval:BIB_38DE994E17E6.P001/REF) – a more recent update to Arnett (2008) that the authors might find useful to scan.

Another similar possibility is to code the abstracts of the papers the authors sample for whether the source of the sample is mentioned, which could tell the authors who researchers take as the implicit “default participant”. This is an approach taken by Kahalon and colleagues (2021; https://journals.sagepub.com/doi/10.1177/19485506211024036).
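
To make the suggestions in this section concrete, the coded characteristics could be stored as one record per article. The sketch below is only an illustration: every class name, field name, and category label is hypothetical, chosen to mirror the examples above, and none of it comes from the authors' protocol.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Sampling(Enum):
    CONVENIENCE = "convenience"
    ONLINE_PANEL = "online panel"
    PROBABILITY = "probability"
    OTHER = "other"


class ResearchType(Enum):
    EXPLORATORY = "exploratory"
    CONFIRMATORY = "confirmatory"


class Setting(Enum):
    LAB = "lab"
    FIELD = "field"
    ONLINE = "online"
    OTHER = "other"


@dataclass
class CodedArticle:
    """One coded article. All field names are hypothetical."""
    article_id: str
    sampling: Sampling
    research_type: ResearchType
    setting: Setting
    topic_keywords: List[str]            # reuse the article's own keywords
    lead_author_country: Optional[str]   # None when not reported
    abstract_mentions_sample_source: bool


def share_mentioning_source(articles: List[CodedArticle]) -> float:
    """Proportion of articles whose abstract names the sample source."""
    if not articles:
        raise ValueError("no articles coded")
    hits = sum(a.abstract_mentions_sample_source for a in articles)
    return hits / len(articles)
```

A fixed set of categories keeps coders from inventing labels as they go, and a summary such as `share_mentioning_source` speaks directly to the "default participant" question raised above.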

Data sources

I have two brief notes on the data sources the authors have chosen.

Chinese journals. I must admit to ignorance as to the landscape of Chinese-language psychology journals, so I can’t really evaluate whether the five Chinese-language journals are a good representation of this landscape. For the benefit of readers like me, can the authors provide some description of how these journals were chosen – and maybe, of the landscape of Chinese journals generally?
Big team science initiatives. The landscape of ManyLabs-style initiatives (or, as I like to call them, “big team science” initiatives; see https://psyarxiv.com/2mdxh/) has grown a lot since the first ManyLabs studies. If you need a list of possible data sources for this style of study, you might want to consult this spreadsheet (https://docs.google.com/spreadsheets/d/1BUURnm0CvwubyYJSp_YFj0ntWLCHHOl_TDXsQgkSN1Q/edit#gid=0), which is compiled and maintained by Dwayne Lieck and Daniel Lakens.

A note on the analysis plan

The proposed analysis is very detailed and uses Bayesian methods that I don’t feel qualified to review in detail. However, I felt generally that the specific analyses may be too focused on evaluating whether Chinese samples are “exactly representative” of the Chinese population. I would advise more thought on the broad goals of the research and the characteristics that need to be coded to achieve those broad goals, then revising the analysis plan.
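
One alternative to testing for "exact" representativeness is to describe how large the gap is. As a minimal sketch (this is not the authors' planned analysis and it is not a Bayesian method), the total variation distance between a sample's category proportions and the census proportions gives a single number between 0 (identical distributions) and 1 (no overlap). The function name, the category labels, and the numbers are invented for illustration only.

```python
from typing import Dict


def total_variation_distance(sample_counts: Dict[str, int],
                             census_shares: Dict[str, float]) -> float:
    """Half the summed absolute gap between sample and census proportions."""
    n = sum(sample_counts.values())
    if n == 0:
        raise ValueError("sample is empty")
    categories = set(sample_counts) | set(census_shares)
    return 0.5 * sum(
        abs(sample_counts.get(c, 0) / n - census_shares.get(c, 0.0))
        for c in categories
    )


# Made-up age distribution of a student-heavy sample vs. made-up census shares.
sample_counts = {"18-29": 80, "30-59": 15, "60+": 5}
census_shares = {"18-29": 0.20, "30-59": 0.50, "60+": 0.30}
gap = total_variation_distance(sample_counts, census_shares)
```

A descriptive measure like this answers "how far off, and on which characteristics?" rather than "is the sample exactly representative?", which may match the project's broad aims more closely.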

Conclusion

I love the topic of this proposal and want to see the finished product. I don’t have strong views on the direction the authors ought to take the protocol – though I do think they might benefit from reflecting a bit on the project’s broad aims. This would give them the opportunity to sharpen the specific goals and research activities so that their project is as impactful as possible.

I sign all my reviews,

Patrick S. Forscher

Research Lead, Busara Center for Behavioral Economics

patrick.forscher@busaracenter.org

(PS: I noticed a few minor English usage mistakes. These didn’t factor into my evaluation at all, which is why I am writing this note at the bottom. However, if the authors want someone to do some quick copy edits whenever they’re looking to submit this to a journal, I’d be willing to help them with this)

https://doi.org/10.24072/pci.rr.100103.rev12
