DOI or URL of the report: https://osf.io/hfte9?view_only=506d243a6e7a4d3680c81e696ca81025
Version of the report: 1 (inside folder Stage2_submission)
I think the manuscript mostly meets the key criteria for Stage 2 acceptance: the authors appear to have conducted the study as described and interpreted the results sensibly according to their pre-specified Stage 1 criteria, with appropriate caveats in the Discussion and appropriate weighting in the abstract.
The only minor question I have about following protocol is why the final participant total was 62 instead of the 60 participants proposed. Was this because of collecting more than 60 in case of exclusions? I’d just suggest adding a sentence somewhere to make this explicit.
I do think, however, that a few structural changes are required to conform with PCI-RR policies:
I note that the authors appear to have altered the introduction from the version that received IPA to add recently published references (e.g., Ostrega et al., 2024). This contradicts PCI-RR policy (https://rr.peercommunityin.org/PCIRegisteredReports/about/full_policies):
“Aside from changes in tense (e.g. future tense to past tense), correction of typographic and grammatical errors, and correction of clear factual errors, the introduction, rationale and hypotheses of the Stage 2 submission must remain identical to those in the approved Stage 1 manuscript. To make any changes clear, authors are required to submit a tracked changes version of the manuscript at Stage 2.”
It is commendable to incorporate recent studies, but I'd suggest that the revised version cites new references not cited in the Stage 1 protocol in the Discussion section instead. (I think it is OK to update references for preprints already cited at Stage 1 to their recently published final versions - e.g., Albouy et al., 2024; Ozaki et al., 2024).
I note the authors have also included exploratory analyses in the Results section before the official "Exploratory analysis" section: "Liking ratings differed for each of the styles in pairwise comparisons with all other styles (all ps <.001; based on average ratings of sessions 1 and 2 and adjusting p-values for multiple comparisons with the Holm method. Note that these comparisons were not preregistered; they were included for completeness, since it seemed reasonable to first present the distribution of our dependent variable)." These (and any other analyses not part of the Stage 1 confirmatory analyses) should be moved to the "Exploratory analyses" section.
Finally, the manuscript refers to some supplementary figures but these are not visible in the manuscript. I suggest the supplementary figures be merged with the main manuscript (after the reference section).
Other points:
In general, I recommend changing the title at Stage 2 submission to something that is more informative about the actual results.
I recommend including more details about the sample in the abstract (e.g., 62 experiment participants, lyrics in Brazilian Portuguese) [sorry I didn't catch this in Stage 1!]. I recommend this editorial for thinking about how to make abstracts and titles more informative: https://www.nature.com/articles/s41562-023-01596-8
Fig. 1: I recommend visualizing individual datapoints in addition to averages/distributions (https://www.nature.com/articles/s41551-017-0079)
Line 549: “r(20) = .37(0.087); and r(20) = .37 (p= .089)”: is it missing a “p=” before “0.087”? Personally I’m not sure it is helpful to even report p-values at all for exploratory analyses, but if you do want to report them you should fix that typo.
Line 571: “mirrorred” typo
I’d remove “highly” from the abstract - it feels a bit strong.
Line 588: I agree that Cronbach’s alpha is “an inadequate measure of interrater agreement”, but you might want to support this with a reference
Line 672: “(and actually equivalent to lullaby singing)” - remove parentheses and “actually”
Line 675: “definetively” typo
Line 721: “the prediction of some consistency of average preferences for some voices across styles was supported by the found interstyle agreement of .52, which, according to the specified threshold of .8, is not considered highly consistent” - this wording is confusing - perhaps instead of “some consistency”, “limited consistency” (as in the abstract) would be better? The grammar of “found interstyle…” also feels a bit awkward.
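Regarding the Line 549 comment above, the guess that “0.087” is a p-value can be sanity-checked by recomputing the two-tailed p-value implied by r(20) = .37. The sketch below is only an illustration, not the authors' analysis: it assumes a standard two-tailed t-test on a Pearson correlation with 20 degrees of freedom, and the function names (`t_pdf`, `two_tailed_p_from_r`) are hypothetical. Because r is reported rounded to two decimals, the recomputed value can differ from the reported ones in the third decimal.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p_from_r(r, df, steps=2000):
    """Two-tailed p-value for a Pearson r, assuming the usual t-test.

    Uses t = |r| * sqrt(df / (1 - r^2)) and integrates the t density on
    [0, t] with Simpson's rule (steps must be even).
    """
    t = abs(r) * math.sqrt(df / (1 - r * r))
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and t
    return 1 - 2 * area


# Values taken from the quoted text: r = .37 with 20 degrees of freedom.
print(round(two_tailed_p_from_r(0.37, 20), 3))
```

If the printed value is close to the .087 and .089 in the quoted text, that supports reading “0.087” as a p-value with the “p=” missing.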