Recommendation

Improving the measurement of social connection

Dorothy Bishop based on reviews by Jacek Buczny, Richard James and Alexander Wilson

A recommendation of:

STAGE 1

A systematic review of social connection inventories

Bastien Paris, Debora Brickau, Tetiana Stoianova, Maike Luhmann, Christopher Mikton, Julianne Holt-Lunstad, Marlies Maes, Hans IJzerman. https://osf.io/preprints/psyarxiv/6ueyd (version 3)

Abstract

Social connection is vital to health and longevity. To date, a plethora of instruments exists to measure social connection, assessing a variety of aspects of social connection like loneliness, social isolation, or social support. For comparability and consistency of the published literature and for policy recommendations, consolidation and evaluation of the quality of measures is crucial. To answer the call for comparability, in Study 1a, we conducted a systematic review to create a database of social connection measures (N=xx) for its structure (N=xx), function (N=xx), and quality components (N=xx), spanning [YEAR] to [YEAR]; after which, in Study 1b, we assessed the heterogeneity of these existing measures through an item-content analysis relying both on human coders, as well as ChatGPT. We identified a total of XX item categories (XX for structure, XX for function, and XX for quality components) with a Jaccard index of XX for structure, XX for function, and XX for quality components. To answer the call for quality assessment, in Study 2a, we conducted a second systematic review on the measures found in Study 1a, creating a database documenting overall validity evidence. In Study 2b, we then evaluated the measurement properties using the COnsensus-based Standards for the Selection of health Measurement Instruments. We found the measurement properties to be [sufficient / insufficient / inconsistent / indeterminate], [sufficient / insufficient / inconsistent / indeterminate], and [sufficient / insufficient / inconsistent / indeterminate]; with [high/moderate/low/very low], [high/moderate/low/very low], and [high/moderate/low/very low] quality of evidence for the structure, function, and quality components, respectively. Finally, we identified the country of origin of the measures and the population groups with which they were developed, using data from Study 1a. Most of the measures were developed in [country name] (XX%) and for [add population characteristics] (XX%). [Overall conclusion].

Keywords: measurement, social connection, social isolation, loneliness, social support, systematic review, quality assessment

Submission: posted 09 July 2023
Recommendation: posted 18 January 2024, validated 19 January 2024

Cite this recommendation as:
Bishop, D. (2024) Improving the measurement of social connection. Peer Community in Registered Reports. https://rr.peercommunityin.org/articles/rec?id=495

Recommendation

This is an ambitious systematic review that uses a combination of quantitative and qualitative methods to make the measurement of the construct of social connection more rigorous. Social connection is a heterogeneous construct that includes aspects of structure, function and quality. Here, Paris et al. (2024) will use predefined methods to create a database of social connection measures, and will assess heterogeneity of items using human coders and ChatGPT. This database will form the basis of a second systematic review which will look at evidence for validity and measurement properties. This study will also look at the population groups and country of origin for which different measures were designed, making it possible to see how far culturally specific issues affect the content of measures in this domain.

The questions asked by this study are exploratory and descriptive and so the importance of pre-registration is in achieving clear criteria for how each question is addressed, rather than evidential criteria for hypothesis-testing.

The authors responded comprehensively to three reviewer reports. This study will provide a wealth of useful information for those studying social connection, and should serve to make the literature in this field more psychometrically robust and less fragmented.

URL to the preregistered Stage 1 protocol: https://osf.io/796uv

Level of bias control achieved: Level 3. At least some data/evidence that will be used to answer the research question has been previously accessed by the authors (e.g. downloaded or otherwise received), but the authors certify that they have not yet observed ANY part of the data/evidence.

References

1. Paris, B., Brickau, D., Stoianova, T., Luhmann, M., Mikton, C., Holt-Lunstad, J., Maes, M., & IJzerman, H. (2024). A systematic review of social connection inventories. In principle acceptance of Version 3 by Peer Community in Registered Reports. https://osf.io/796uv

Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.

Reviews

Reviewed by Richard James, 16 Jan 2024

The authors have done an extremely thorough job of responding to these comments. The revised manuscript is extremely comprehensive, and has been thoughtfully revised in a manner that substantially improves the Stage 1 Report. Overall, I am happy for this to go forward to Stage 2, with potential for minor changes but nothing that requires fundamental revisions.

The only point I wanted to explain further was with regard to the forward searching. My concern here was that the approach taken assumes that (1) studies using these inventories will be captured within the search terms and (2) focused searches with the measure name and methodological/measurement terms (which I thought were comprehensive) will pick out the relevant studies reporting validity. The search terms included for study 2 are thorough, and from initial searches in study 1 the principal search terms produced several hundred thousand results. As such I think the first assumption is fair. However, I am less certain of the second, which assumes these details are reliably captured in the bibliographic details of existing studies. Having considered it further, I also think this issue is likely to vary by domain, as for some properties (e.g. cross-cultural validity, measurement invariance, structural validity) this is often the focal point of a published article and so will be flagged prominently, whereas for others (e.g. criterion validity, internal consistency) it will not. The suggestion for the forward searches was mainly with the view of confirming the sensitivity of the search terms to pick up relevant articles assessing the validity of these domains. While I agree with the authors' hypothesis that these measures will perform poorly on the COSMIN taxonomy domains, it might be the case that relevant data is systematically missing because of deficiencies in the reporting of these statistics. However, I am also conscious that depending on the number of inventories identified this might entail substantial work. As such, I think this can be left to the authors' discretion, perhaps either testing a sample of measures to check whether this is redundant or not, or discussing the issue as a potential limitation in the stage 2 RR.

One thought on the redundancy of the search terms: for the reporting of the stage 2 RR, it would be possible to quantify the number of unique entries by extracting the DOIs and titles (for articles without DOIs).
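The counting idea in the paragraph above could be sketched roughly as follows. This is only an illustration, not the authors' pipeline: the record layout (dictionaries with `doi` and `title` fields), the function names, and the example DOIs are all assumptions made for the sketch. Records are keyed on a normalised DOI where one exists and on a normalised title otherwise.

```python
import re

def normalise_doi(doi: str) -> str:
    """Lower-case a DOI and strip a leading resolver URL or 'doi:' prefix."""
    doi = doi.strip().lower()
    return re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", doi)

def normalise_title(title: str) -> str:
    """Lower-case a title and collapse punctuation and whitespace runs."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def record_key(record: dict) -> tuple:
    """Use the DOI as the key when present, otherwise fall back to the title."""
    doi = (record.get("doi") or "").strip()
    if doi:
        return ("doi", normalise_doi(doi))
    return ("title", normalise_title(record.get("title") or ""))

def count_unique(records: list) -> int:
    """Number of distinct records across all search exports."""
    return len({record_key(r) for r in records})

# Hypothetical export rows; the DOIs and titles are made up for illustration.
example_records = [
    {"doi": "10.1000/abc", "title": "A scale"},
    {"doi": "https://doi.org/10.1000/ABC", "title": "A Scale."},
    {"doi": None, "title": "Another  Scale: Validation"},
    {"doi": "", "title": "another scale validation"},
]
n_unique = count_unique(example_records)
```

One limitation of this keying is that a record exported with a DOI from one database and without a DOI from another would be counted twice, so a title-based second pass may be needed in practice.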

Minor comment: H2 on the design table - I'd revise from "will show insufficient evidence of great measurement properties" to "show evidence of insufficient measurement validity".

Otherwise, I look forward to seeing how this comes out in the Stage 2 Report!

https://doi.org/10.24072/pci.rr.100495.rev21

Reviewed by Alexander Wilson, 12 Dec 2023

I enjoyed reading the authors' response to the reviewers' comments. I was impressed by the level of detail and felt that the authors addressed the reviewers' queries very well. I was particularly pleased that the authors gave careful consideration to refining the methodology of study 1 (item coding) as this was where my main concerns lay. I hope the changes make the coding more robust.

Good luck with carrying out the review. I look forward to reading the results.

Alex Wilson

https://doi.org/10.24072/pci.rr.100495.rev22

Evaluation round #1

DOI or URL of the report: https://osf.io/spt8m?view_only=b3e7ac48db7342ca87ff70e23658ec82

Version of the report: 1

Author's Reply, 29 Nov 2023

https://doi.org/10.24072/pci.rr.100495.ar1

Decision by Dorothy Bishop, posted 26 Sep 2023, validated 26 Sep 2023

Dear Dr IJzerman and colleagues

Thank you for your patience in waiting for a decision on your manuscript. I now have three thorough reviews. My impression is that, though the reviews are detailed, they do not raise any issues that you won't be able to deal with. They are mostly concerned with clarification rather than suggestions for major methodological changes.

As someone who is not familiar with this domain of assessment (other than having recently completed a questionnaire on social connection as a Biobank participant!) I have one question, which relates to the way in which the 3 different aspects of social connection are interpreted. Specifically, I wondered whether the structural measures are regarded as a kind of context against which function and quality would be evaluated. For instance, opportunities for social connection, and a mismatch between those opportunities and reality, might be rather different for someone who is employed vs retired, or for someone who is single by choice rather than widowed or divorced. In other words, do researchers in this area use the structural measures to stratify samples when considering the impact of function and quality? This is an idle thought prompted by my curiosity from being on the receiving end of a questionnaire, and perhaps a distraction from your main aims, so feel free to ignore if this isn't something that can be usefully addressed here.

The paper is different from the kinds of registered report I usually see, which tend to be empirical, hypothesis-testing studies, but I can see the value of pre-registering the methods for a complex piece of work like this. It will give the study more authority by demonstrating that decisions were made according to rigorous predetermined criteria, rather than ad hoc. It was very useful to have your scripts and mock-up data available to clarify any questions about the analysis, and to give confidence that the analyses will be able to be conducted in a timely fashion.

I would encourage you to submit a revised manuscript that addresses the comments of reviewers, and look forward to seeing it in due course.

https://doi.org/10.24072/pci.rr.100495.d1

Reviewed by Richard James, 05 Sep 2023

This Stage 1 Registered Report proposes an ambitious programme of research, utilising systematic review in Study 1a to draw together different indices of social connection, and categorise them into three previously proposed domains: structural, functional, and qualitative. Then in Study 1b, it is proposed that an item content analysis will be undertaken on these measurements to categorise them into different sub-domains, and subsequently assess the extent to which there is overlap in content of measures that have been used to measure social connection. Then, in Study 2a the systematic review process will be extended to extract data from studies that have utilised the measurements reviewed in Study 1. In Study 2b, the extracted data will be evaluated using the COSMIN taxonomy to assess measurement properties, and whether measurement invariance has been established between countries and populations studied in subsequent research.
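The abstract names the Jaccard index as the statistic for content overlap in Study 1b. A minimal sketch of how such an index can be computed is given here, assuming each measure is represented as the set of item categories it covers. The measure names, the category labels, and the choice to average over all pairs of measures are assumptions for illustration; the protocol itself should be consulted for the authors' actual procedure.

```python
from itertools import combinations

def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union.

    Returns 0.0 when both sets are empty (a convention chosen for this sketch).
    """
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

def mean_pairwise_jaccard(measures: dict) -> float:
    """Average Jaccard index over every unordered pair of measures."""
    pairs = list(combinations(sorted(measures), 2))
    if not pairs:
        raise ValueError("need at least two measures")
    return sum(jaccard(measures[x], measures[y]) for x, y in pairs) / len(pairs)

# Hypothetical measures and item categories, made up for illustration.
example_measures = {
    "measure_a": {"loneliness", "companionship", "isolation"},
    "measure_b": {"loneliness", "emotional_support"},
    "measure_c": {"network_size"},
}
overlap_ab = jaccard(example_measures["measure_a"], example_measures["measure_b"])
overall = mean_pairwise_jaccard(example_measures)
```

A value near 0 indicates that measures share few item categories, and a value near 1 indicates that they cover largely the same content.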

I thought the Registered Report was really well written and thought out, and it sounds like a really exciting piece of research. This is an extremely ambitious piece of work that I think has the potential to make a major contribution to improving measurement in this area. Although I am not a specialist on social connectedness, my own experience with population-wide data where measures of social connection have been collected highlights this as being a glaring problem that can easily prevent the development of our understanding in a number of directions (e.g. cross-national comparisons, use of poorly validated or invalid measures to draw fragile or biased conclusions).

That being said, I did have some specific comments that it would be good to get the authors' consideration on and make changes as necessary. I had some methodological comments where there is scope for making minor amendments to strengthen the approach. Also, while I thought the manuscript was very well written, some of the research questions (RQ 3 and 4) didn't seem to be strongly represented in the RR itself, and this is an area where I felt this could be revised. These are mostly pretty minor to be honest though, and are included below in the order of presentation in the manuscript:

- RQ3 and RQ4: I didn't think these really came out in the Stage 1 Report as being key aims of the research. I thought the area where this was most clearly referred to was in the abstract. Reading through the report without reference to the RQ table, my impression would be that the results will be reporting the country/population of study, i.e. to represent coverage in the literature, rather than whether the application of these measures to other contexts is meaningful (i.e. through use of measurement invariance). The paragraph from lines 190-203 makes the case strongly for the importance of testing these questions, but I thought the end of the paragraph from lines 204-212 ought to make it clearer that the aim of this exercise is to assess whether these generalizations are defendable. Similarly I would recommend re-working Section 2 a bit, ideally with a specific sub-heading in the methods for 2b highlighting that this is a specific set of analyses, and how these will be reported in the results, with reference to the proposed findings on whether the measures have been validated in countries/populations where they have been applied.

- My main concern for Study 1 relates to the justification for the structural indicators searches. I completely understand that parsing through 400,000+ results is not feasible or an effective use of time. However, the use of a random subsample has potential drawbacks. First, I have reservations that the variety of different types of structural indicator would be captured by a random sample of a similar number of results as the number of functional and qualitative indicators. Given the information the authors have presented, my impression is that there will be much greater heterogeneity among structural indicators relative to the functional and qualitative ones. Second, given the issues reported in the Stage 1 submission so far, it seems fair to expect the results to be far noisier. I wondered whether it might be preferable to stratify the sample to capture a subset of the most relevant results and a random sample (sorted by time), but am also conscious this has its own drawbacks. I would appreciate the authors' thoughts on this, and some additional justification of the sampling approach in revising the methods section.

- The justification for 60% agreement on the item content analysis raises questions. Again, I understand, given the potential range and heterogeneity of measures, how this would be difficult. I think some additional justification of this criterion would be useful with reference to specific studies where this has been a problem.

- Study 2a: I agree with the overall approach for the systematic review, and the searches are specifically defined to identify appropriate studies. The only concern I had was that the search strategy relies on the studies clearly flagging this, which in my own experience of gathering data to examine a scale or scope the literature isn't guaranteed.

I would like the authors to consider whether there would be value in including select forward citation searches of key papers relating to the scales identified (i.e. initial validation papers), to ensure any relevant studies aren't missed. Otherwise, I agree with not conducting further reviews in the structural indicators domain given the use of single item scales. If validated measures do come up from that search, though, the use of forward citation searches may be a reasonable adjustment to ensure these studies are properly captured.

- Study 2a: Reading through this, I wondered whether it would be worth specifically recording whether a non-standard or modified use of a measurement was applied as a variable in the template for extracting sample characteristics. I really liked the use of the COSMIN taxonomy to systematize the quality of the measurements, and think it is a particular strength of evaluating the measurement properties of the scales to be examined. However, I'm also conscious COSMIN doesn't capture some questionable measurement practices that are important in qualifying the use of many measurements, especially where the inconsistent use of measures is a key problem. From my own experience of scoping across a large literature, I find that I quickly begin encountering studies where existing scales have been modified (i.e. different response scales, subsets of questions), and that some scales are more susceptible to it than others (e.g. length of questionnaire, use of many or very few response options). When thinking about the Stage 2 discussion, this might also reinforce some of the evaluation of these measures.
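To make the stratified alternative raised above for the structural-indicator search more concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the authors' procedure: the record fields (`relevance`, `year`), the function name, and the stratum sizes are all hypothetical, and it assumes the database export provides some relevance score per hit.

```python
import random
from collections import defaultdict


def stratified_screening_sample(records, n_top, n_random, seed=0):
    """Split search hits into a 'most relevant' stratum and a random stratum.

    records: list of dicts with hypothetical keys 'id', 'relevance', 'year'.
    Returns (top, sampled): the n_top most relevant records, plus a random
    draw from the remainder allocated proportionally across publication years.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced

    # Stratum 1: the n_top hits with the highest relevance score.
    ranked = sorted(records, key=lambda r: r["relevance"], reverse=True)
    top, rest = ranked[:n_top], ranked[n_top:]

    # Stratum 2: group the remaining hits by year, then sample within each
    # year in proportion to that year's share of the remainder.
    by_year = defaultdict(list)
    for rec in rest:
        by_year[rec["year"]].append(rec)

    sampled = []
    for year in sorted(by_year):
        group = by_year[year]
        # Rounding means the total can differ slightly from n_random.
        k = min(len(group), round(n_random * len(group) / len(rest)))
        sampled.extend(rng.sample(group, k))
    return top, sampled


if __name__ == "__main__":
    # Toy data: 100 hits spread evenly over 10 publication years.
    hits = [{"id": i, "relevance": i, "year": 2000 + i % 10} for i in range(100)]
    top, sampled = stratified_screening_sample(hits, n_top=10, n_random=20)
    print(len(top), len(sampled))
```

Fixing the seed and recording both stratum sizes in the protocol would keep the draw reproducible. Whether a usable relevance ranking exists for every database searched is an open question that the sketch does not address.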

In terms of PCI:RR's review criteria:

- 1A: Scientific validity of the research question: The scientific validity of the research questions is clear and obvious. This is an area where there is a clear need to understand and improve measurement practices, and the authors take a rigorous approach to understanding and evaluating the problems at hand.

- 1B. The logic, rationale, and plausibility of the proposed hypotheses, as applicable: Not directly relevant as the RR does not propose hypotheses.

- 1C. The soundness and feasibility of the methodology and analysis pipeline (including statistical power analysis or alternative sampling plans where applicable): Generally this was very well thought out. The OSF has detailed instructions regarding the literature search, coding of the studies and how that will segue into the formal analysis. I have included some minor comments on the methodological approach.

- 1D. Whether the clarity and degree of methodological detail is sufficient to closely replicate the proposed study procedures and analysis pipeline and to prevent undisclosed flexibility in the procedures and analyses: Yes. The authors are extremely clear with the reporting of their literature searches.

- 1E. Whether the authors have considered sufficient outcome-neutral conditions (e.g. absence of floor or ceiling effects; positive controls; other quality checks) for ensuring that the obtained results are able to test the stated hypotheses or answer the stated research question(s): Not directly relevant, as the findings are not being compared between conditions.

https://doi.org/10.24072/pci.rr.100495.rev11

Reviewed by Jacek Buczny, 25 Sep 2023

Dear Authors,

This RR is a very interesting project. There are multiple psychological constructs that are operationalized in diverse ways, and almost everyone would admit that it makes it very hard to reproduce and replicate psychological studies. I want to provide a few suggestions in my review, hoping you will find them useful.

First, the research questions/hypothesis seem well-rooted in theory.

Secondly, the analytical techniques correspond well with the research questions and can provide an adequate hypothesis test. The use of COSMIN is a very good idea; however, to me, it is not clear how the tool will be used. Of course, conducting an evaluation based on COSMIN is not a difficult task, but for reproducibility, more details on how you want to use it would be welcome.

Thirdly, you mention PRISMA for the first time in the Results section of Study 2b. Why at this stage? I ask because PRISMA is a general framework used for conducting systematic reviews, so I would expect PRISMA to be mentioned in the overview of Study 2a. Instead, you want to use COSMIN guidelines only. What was the particular reason not to use PRISMA to design the study? In addition, on p. 25, you write that the analysis code can be found at https://osf.io/wfers – unfortunately, I could not find it. A similar comment applies to this: https://osf.io/n7z4y.

Fourthly, I wonder why you have not considered applications of the Social Relations Model (SRM) as an important source of instruments. The SRM posits that social perception/evaluation/traits variance can be partitioned into various components: target variance, perceiver variance, relationship variance, and error variance, and such information can be collected by implementing a round-robin design. I can imagine that specific aspects of social connections (e.g., social support, responsiveness, quality) can be attributed to each type of variance and depend on each other. This gap is a bit puzzling as a thorough understanding of social relationships should account for such specific components.
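For reference, the variance partition described above is usually written along the following lines. The symbols are generic notation chosen here for illustration and are not taken from the manuscript. The rating that perceiver i gives target j (with i ≠ j) is decomposed into a grand mean, a perceiver effect, a target effect, a relationship effect, and error:

```latex
X_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ij},
\qquad
\operatorname{Var}(X_{ij}) = \sigma^2_{\alpha} + \sigma^2_{\beta} + \sigma^2_{\gamma} + \sigma^2_{\varepsilon}
```

Here $\alpha_i$ is the perceiver effect, $\beta_j$ the target effect, and $\gamma_{ij}$ the relationship effect specific to the dyad. Separating $\sigma^2_{\gamma}$ from $\sigma^2_{\varepsilon}$ generally requires more than one indicator or measurement occasion per dyad.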

Fifthly, maybe I overlooked it while reading, but it is unclear whether you will evaluate the quality of the theory used to create a specific measurement. I can imagine a scenario in which a good measure is created (reliable, valid, invariant across genders and cultures), but the background theory is rather weak.

Despite the critical comments, I find the protocol clear and well-prepared for implementation. Before recommending the current version of the document, I would like to know what the other reviewers indicated and what your reaction to my comments is.

All the best,
Jacek Buczny

https://doi.org/10.24072/pci.rr.100495.rev12

Reviewed by Alexander Wilson, 10 Sep 2023

Download the review https://doi.org/10.24072/pci.rr.100495.rev13
