Introduction
The Transparency and Openness Promotion (TOP) guidelines are a series of modular standards for transparency and reproducibility in published research. For background, authors are referred to the TOP overview at the Center for Open Science and the TOP introductory article published in Science (see page 1424 of the introductory article for a table explaining the meaning of the various TOP levels). As a signatory to the TOP guidelines, PCI RR is committed to regular review of its adherence to TOP. We currently adopt the following levels within each of the eight TOP standards, which can range from Level 0 to Level 3. Publication in PCI RR is contingent on authors adhering to these standards where they apply.
At the point of Stage 1 and Stage 2 submission, authors are asked to complete a short checklist confirming adherence to the TOP guidelines.
Standard #1: Citation Standards (Level 3)
All data, program code and other methods must be appropriately cited. Such materials are recognised as original intellectual contributions and afforded recognition through citation. Articles will not be published until the citations conform to these standards.
- All data sets and program code used in a publication must be cited in the text and listed in the reference section.
- References for data sets and program code must include a persistent identifier, such as a Digital Object Identifier (DOI). Persistent identifiers ensure future access to unique published digital objects, such as a text or data set. Persistent identifiers are assigned to data sets by digital archives, such as institutional repositories and partners in the Data Preservation Alliance for the Social Sciences (Data-PASS).
- Data set citation example: Campbell, A., and Kahn, R. L. (1999). American National Election Study, 1948. ICPSR07218-v3. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor]. http://doi.org/10.3886/ICPSR07218.v3
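Authors who want to confirm that each data or code reference carries a DOI before submitting could run a quick check along the following lines. This is a minimal sketch rather than a PCI RR tool: the function name `find_doi` and the pattern are illustrative assumptions, the pattern is only a heuristic for DOI-like strings, and it does not verify that the identifier resolves.

```python
import re
from typing import Optional

# Heuristic pattern for DOI-like strings (an assumption for illustration).
# It matches the "10.<registrant>/<suffix>" shape but does not check that
# the DOI is registered or resolves.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def find_doi(reference: str) -> Optional[str]:
    """Return the first DOI-like string found in a reference, or None."""
    match = DOI_PATTERN.search(reference)
    if match is None:
        return None
    # Drop trailing punctuation that belongs to the sentence, not the DOI.
    return match.group(0).rstrip(".,;)")


example = (
    "Campbell, A., and Kahn, R. L. (1999). American National Election Study, "
    "1948. ICPSR07218-v3. Ann Arbor, MI: Inter-university Consortium for "
    "Political and Social Research [distributor]. "
    "http://doi.org/10.3886/ICPSR07218.v3"
)
print(find_doi(example))  # prints 10.3886/ICPSR07218.v3
```

A reference for which `find_doi` returns `None` may still carry another kind of persistent identifier, so a missing match is a prompt to look again rather than proof of non-compliance.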
Standards #2, #3, and #4: Data, Analytic Methods (Code), and Research Materials Transparency (Level 2)
The policy of PCI RR is to publish papers only if the data, methods used in the analysis, and any digital materials used to conduct the research are clearly and precisely documented and are maximally available to any researcher for purposes of reproducing the results or replicating the procedure.
Authors reusing data available from public repositories must provide program code, scripts for statistical packages, and other documentation sufficient to allow an informed researcher to precisely reproduce all published results. Potential repositories that support open or embargoed archiving include (but are not limited to) Zenodo, Figshare, Harvard Dataverse, Dryad, the Knowledge Network for Biocomplexity, and the Open Science Framework. For a comprehensive list of available data repositories, see http://www.re3data.org/
Authors using original data must:
- Make appropriately anonymised data available within a trusted digital repository OR provide a statement in the manuscript and TOP submission checklist explaining why data are not publicly archived and how data can otherwise be accessed. (Note: If all data required to reproduce the reported analyses appear in the article text, tables, and figures, then they do not also need to be posted to a repository.)
- Include all variables, treatment conditions, and observations described in the manuscript.
- Provide a full account of the procedures used to collect, preprocess, clean, or generate the data.
- Where applicable, provide program code, scripts, codebooks, and other documentation sufficient to precisely reproduce all published results OR provide a statement in the TOP submission checklist explaining why code is not publicly archived and how any such code can otherwise be accessed.
- Provide digital research materials (e.g. stimuli, code) and description of procedures necessary to conduct an independent replication of the research OR provide a statement in the TOP submission checklist explaining why research materials are not publicly archived and how they can otherwise be accessed.
In some cases, some or all data, code or digital materials cannot be shared for legal or ethical reasons. For example, in some studies, patient data can be impossible to fully anonymise, or authors may lack ethical permission to archive even fully anonymised data. In other cases, experimental materials (such as stimuli, questionnaires) or analysis code might be proprietary and may therefore be unpublishable. PCI RR will grant exceptions to data, code and material access requirements provided authors:
- as outlined above, explain the restrictions on the data, code or materials and how they preclude public access. Example text in the TOP submission checklist might include:
  “The conditions of our ethics approval do not permit public archiving of anonymised study data. Readers seeking access to the data should contact the lead author X or the local ethics committee at the Department of Y, University of Z. Access will be granted to named individuals in accordance with ethical procedures governing the reuse of sensitive data. Specifically, requestors must meet the following conditions to obtain the data [insert any conditions, e.g. completion of a formal data sharing agreement, or state explicitly if there are no conditions].”
  “Legal copyright restrictions do not permit us to publicly archive the full set of stimuli used in this experiment. Readers seeking access to the stimuli are advised to contact the lead author X or copyright holder [insert details]. Stimuli will be released on the following conditions [insert any conditions or state explicitly if there are no conditions].”
- provide a public description of the steps others should follow to request access to the data, code or digital materials – e.g. through direct contact with authors, the relevant ethics committee or other external authority.
- provide software and other documentation that will precisely reproduce all published results.
- provide access to all data, code and digital materials for which the constraints do not apply.
- Where shared publicly, any data, code, digital materials, and other documentation of the research process should be made available through a trusted digital repository. Trusted repositories adhere to policies that make data discoverable, accessible, usable, and preserved for the long term. Trusted repositories also assign unique and persistent identifiers. Author-maintained websites are not compliant with this requirement.
- Dissemination of these digital materials may be delayed until publication. Under exceptional circumstances, the PCI RR Managing Board may grant an embargo of the public release of data for at most one year after publication.
- Authors are responsible for ensuring that their articles continue to meet these conditions. Failure to do so may lead to an expression of concern or retraction of the article.
- The above policy ensures that PCI RR is fully compliant with the Peer Reviewers’ Openness (PRO) initiative.
Standard #5: Design and Analysis Transparency (Level 2)
The policy of PCI RR is to publish papers where authors follow standards for disclosing key aspects of the research design and data analysis. Authors are encouraged to review the standards available for many research applications from http://www.equator-network.org/ and use those that are relevant for the reported research applications. As part of the TOP submission checklist at Stage 2, authors are required to confirm the following text (based on the 21-word solution proposed by Simmons et al., 2012), with elaboration as appropriate in the main body of the manuscript: "We report how we determined our sample size, all data exclusions (if any), all data inclusion/exclusion criteria, whether inclusion/exclusion criteria were established prior to data analysis, all manipulations, and all measures in the study".
Standards #6 and #7: Preregistration of Studies and Analysis Plans (Level 3)
The policy of PCI RR is to publish papers where the study design and analysis plan have been preregistered in an independent, institutional registry (e.g., http://clinicaltrials.gov/, http://socialscienceregistry.org/, https://osf.io/, https://egap.org/registry/, http://ridie.3ieimpact.org/). Preregistration of studies involves registering the study design, variables, and treatment conditions prior to conducting the research. Including an analysis plan involves specifying the sequence of analyses or the statistical model that will be reported. PCI RR performs this preregistration on behalf of authors at the point of Stage 1 in-principle acceptance (IPA), and a link to the preregistration in an institutional registry will be made available to authors as part of the Stage 1 IPA recommendation.
In the TOP guidelines checklist at Stage 2 submission, authors must confirm that the study was preregistered in an independent, institutional registry. Authors are also required to:
- Confirm that the study was registered prior to conducting the research with links to the time-stamped preregistration(s) at the institutional registry, and that the preregistration adheres to the disclosure requirements of the institutional registry OR those required for the preregistered badge with analysis plans maintained by the Center for Open Science.
- Report all preregistered analyses in the text or, if the analysis plan changed following preregistration, disclose those changes and explain the reasons for them.
- Clearly distinguish in the text analyses that were preregistered from those that were not.
Standard #8: Replication (Level 3)
The policy of PCI RR is to encourage submission of replication studies. As with all studies recommended by PCI RR, replication studies are reviewed in two stages as Registered Reports.
Tips for avoiding delay in handling your manuscript
TOP compliance is assessed by PCI RR recommenders at the point of Stage 2 submission. During the submission process, authors are asked to complete a brief checklist that addresses each of the TOP standards. The manuscript and responses are assessed by PCI RR, and the manuscript proceeds to review only once TOP compliance is confirmed.
A significant number of revised manuscripts are returned to authors for failing to comply with the TOP policy, delaying the review process by days and sometimes weeks. Avoiding the oversights listed below will significantly accelerate the handling of your manuscript. To further avoid delay, we recommend considering the ethical, legal and practical challenges in archiving your data, digital materials, and code well in advance of preparing your Stage 2 manuscript.
Below are the main reasons revised manuscripts are returned to authors before entering the review process:
- Manuscript fails to include page numbers, which are necessary to confirm that the manuscript includes the required information and transparency statements.
- Authors responding “No” or “N/A” in the checklist to indicate lack of archiving of data, digital materials or code, but failing to provide full details of why archiving is not possible in the text box that accompanies each question. Responding “No” or “N/A” without explanation will always result in the manuscript being returned to the author.
- Failure to ensure that any information provided in the checklist answers is also stated in the manuscript, or failure to provide the page numbers for this information in the checklist as instructed.
- The inclusion of vague statements such as "data/code/materials will be available upon (reasonable) request from X" without specifying the full details of the process for providing data, and without defining any qualifiers such as "reasonable". Note that under PCI RR policy, authors must publicly archive (e.g. on https://osf.io or any other institutional repository) all individual anonymised data (at raw and summary level) that are necessary and sufficient to reproduce all analyses and data presentations reported in the paper and supplementary information (for all types of data reported, including any behavioural, clinical and imaging data), or alternatively must explain in the manuscript the legal or ethical (e.g. confidentiality) barriers to anonymised public archiving and how readers can access the data. Where such restrictions apply, remember to make clear which individual or organisation is responsible for granting or refusing data access requests (e.g. corresponding author; ethics committee; legal authority), and the precise conditions requestors must meet to obtain the data (e.g. data sharing agreement, legal agreement, collaboration agreement). If data access requests are granted by the corresponding author without any conditions then this must be stated explicitly.
- Stating that data/materials/code are not publicly archived for ineligible reasons, such as the intention to publish further papers from the same dataset, or because the data are not sufficiently organised for archiving. Barring rare exceptions negotiated in advance with the PCI RR recommender, the only permissible reasons for lack of public archiving of data, materials and analysis code are legal or ethical barriers.
- Failure to publicly archive materials or code due to legal restrictions that are self-imposed by the authors and that would otherwise not be required by the authors’ institution or other external legal authority. For example, authors sometimes attempt to use their own assertion of copyright over analysis code as a basis for refusing to publicly archive the code. This is not permitted under PCI RR policy. Where the authors have complete legal power to publicly archive materials, and there are no ethical barriers to doing so, the materials must be publicly archived. Authors are encouraged to use a GNU or Creative Commons license to control reuse and attribution: see https://www.gnu.org/licenses/licenses.en.html or https://creativecommons.org/licenses/
- Invoking copyright restrictions on the public archiving of any digital materials or code, but failing to state the identity of the copyright holder in the manuscript. Where the copyright holder is unknown (e.g. as can happen in some cases where stimuli used in a study are sourced from the internet) then this should be stated explicitly.
- In cases where data, materials or code are archived, failure to ensure there is a README file or codebook included in the archive that explains the content of every file and any variable labels within files.
- In cases where data, materials or code are archived, failure to use a trusted digital repository that adheres to policies that make data discoverable, accessible, usable, and preserved for the long term. Trusted repositories also assign unique and persistent identifiers. Note that author-maintained websites (including Dropbox, Google sites, or Google drive folders) are not compliant with this requirement. Potential repositories that support open or embargoed archiving include (but are not limited to) Zenodo, Figshare, Harvard Dataverse, Dryad, the Knowledge Network for Biocomplexity, and the OSF. For a comprehensive list of available data repositories, see http://www.re3data.org/
- Failure to ensure that a public archive is either publicly accessible or that the manuscript includes a private view-only link. For example, where using the OSF, it is acceptable for authors to keep an archive private until acceptance, provided the manuscript includes the private view-only URL that is accessible to the reviewers and recommender (for guidance on creating private view-only links, see https://help.osf.io/hc/en-us/articles/360019930333-Create-a-View-only-Link-for-a-Project). If the manuscript is accepted, the private view-only link is then replaced with a publicly accessible link.
- Where some of the data, digital materials or analysis code are archived, but some are not, failure to state explicitly in the manuscript which data/materials/code are missing from the archive, and the legal or ethical restrictions that apply to accessing those missing data/materials/code.
- Where invoking ethical restrictions, specifically, for archiving of any digital materials (e.g. experiment task code, stimuli, assessment batteries) or analysis code, failure to explain in the manuscript what ethically sensitive content is contained in these materials or code and is therefore subject to ethical restrictions.
- Where the following statement is correct: “We report how we determined our sample size, all data exclusions, all inclusion/exclusion criteria, whether inclusion/exclusion criteria were established prior to data analysis, all manipulations, and all measures in the study”, failure to include this exact statement in the manuscript. Note that it is usually not compliant to divide this statement between different pages or sections.
- Stating in the TOP checklist that “All data/materials/code are contained in the manuscript” when this is not the case.
- Stating in the TOP checklist that “There is no analysis code” but then referring to analysis code in the manuscript (e.g. R code, custom MATLAB scripts, etc.).
- Stating in the TOP checklist that “There are no digital study materials” but then describing experiments presented using code, stimuli, questionnaires etc.
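One of the oversights listed above, an archive that lacks a README file or codebook, is easy to catch before submission by scanning a local copy of the archive. The sketch below is only an illustration under stated assumptions: the function name `has_documentation` and the file-name stems in `DOC_STEMS` are hypothetical choices, not PCI RR requirements, and the check confirms only that such a file exists, not that it explains every file and variable label.

```python
from pathlib import Path

# Hypothetical file-name stems treated as documentation (an assumption for
# illustration; adjust to the naming used in your own archive).
DOC_STEMS = ("readme", "codebook")


def has_documentation(archive_dir: str) -> bool:
    """Return True if any file under archive_dir looks like a README or codebook."""
    for path in Path(archive_dir).rglob("*"):
        # Compare the file name without its extension, ignoring case.
        if path.is_file() and path.stem.lower().startswith(DOC_STEMS):
            return True
    return False
```

Calling `has_documentation("path/to/local/archive")` on a downloaded or local copy of the archive returns `False` when no matching file is found, which is a signal to add one before submitting the Stage 2 manuscript.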