DOI or URL of the report: https://osf.io/pb5nu
Version of the report: 2
I examined the revised report after it was submitted and I believe that the authors have thoroughly addressed the comments from the reviewers. However, I noticed that the wording of hypothesis 3 in Study 3 lacked clarity and that there was redundancy between hypotheses 2 and 3 in the same study.
In the words of the lead author "The new Study 3 H2 logically follows from the combination of H1 and H3. If there's no relationship between total playtime and wellbeing (H1 true) + no differences between genres (H3 true) then H2 has to be false (genres do not differ from 0), whereas if there is a relationship between total playtime and wellbeing (H1 false) + no differences between genres (H3 true) then H2 has to be true (all genres differ from 0)." Based on this, the authors prefer to cut H2 for clarity and reduce the number of Study 3 hypotheses back to 2 without losing any information.
I agree with this assessment; hence, the authors are now asked to make this revision and resubmit their report.
DOI or URL of the report: https://osf.io/ne74d?view_only=a64402603dca42f8ae5f789b09f4afce
Version of the report: 1
Please find our responses to reviewer comments attached.
Dear authors,
Thank you for a well-written report, as judged by all reviewers. I have gone through the report and the reviews and I believe that the report is in excellent shape. There are, however, relatively minor clarifications needed. Most of these can be resolved with linguistic clarification, but some require thinking more carefully about your procedure. I would especially encourage you to consider the feedback on the methods, controls, and selection & exclusion criteria clarifications, as these will bear the most weight on the success of Stage 2.
Looking forward to the revised report!
All the best,
Lobna
I read this Stage 1 submission with great interest. Since I am not a field expert, I leave specialist assessment of the theory and rationale to specialist reviewers and focus my review on general Registered Reports evaluation criteria and methodological rigour.
Overall I felt this was a very clear and impressive submission that tackles a series of important research questions in an innovative way. The combination of digital trace data with longitudinal psychological data, together with a focus on reproducibility and transparency, seems (to this non-specialist) to be an ideal vehicle for moving this field forward. I also judge that the three programmatic components of the submission are sufficiently substantive to justify separate Stage 2 outputs.
I have very few comments, but offer the following suggestions to help maximise the quality of the Stage 1 proposal:
1. I suggest including a summary of the sample planning precision estimates in the section Sample Size Determination. It is fine to include the bulk of this in supplementary information, but there should be enough content in the main manuscript to provide a general overview (at the moment there really isn't enough meat in the main manuscript).
2. "We do not preregister any further exclusion criteria; in case of further quality checks (e.g., using careless; Yentes & Wilhelm, 2021) identifying additional responses to exclude, we will report results with both minimal and maximal exclusions applied." This sounds ok but, of course, any post hoc exclusion criteria should be applied strictly to exploratory analyses. I'm assuming this is what the authors intend; if not, then these further quality checks must be precisely prespecified.
3. Please add the various positive controls to the design tables, and in the main manuscript (and the interpretation column of the design table for the corresponding row), note the consequences for evaluation of the main hypotheses in the event that one or more of the positive controls fail. I would also strongly encourage a sensitivity power analysis for these positive controls. The success of positive controls can be critical for Stage 2 acceptance (see criterion 2A), so it is very much in the authors' own interests to be sure that the design is sufficiently robust and sensitive to capture these sanity checks.
4. “Responses where the two duplicate items differ by more than 1 scale point will be flagged for manual inspection of potential careless responding.” – please define the precise rule for exclusion. What specific signs will constitute careless responding?
5. “We anticipate approximately 10% attrition per wave of the panel study, and 30% total attrition for the diary study.” Since participants who are excluded or drop out are (presumably) not replaced, please specify a minimum sample size that will be considered sufficient to answer each research question (and test each hypothesis) and therefore justify a Stage 2 submission. I am assuming this will be some number substantially below 1000 (and probably below 700, i.e. 1000 minus 30% attrition minus maximum tolerable exclusions).
Signed:
Chris Chambers, Cardiff University
PCI RR Managing Board
I thank the managing board of PCI Registered Reports and the authors for the opportunity to peer review this interesting stage 1 report using digital trace data alongside longitudinal wellbeing data to explore the quality and context of play on a large scale.
Reviewer’s disclosure:
I am a first-year PhD student, and lack the expertise to comment on matters such as the measures and the statistical analyses in R. I hope my comments are still useful for the authors.
Technical:
-The ORCID link listed for Przybylski A. actually points to Ballou’s ORCID.
-The authors may consider registering the Limitations section already at Stage 1, e.g. for reflexive reporting.
Basic Psychological Needs in Games and Wellbeing (Study 1)
The simultaneous validity testing and expansion of the BANG hypotheses is impressive and commendable. I hope its use will give qualitative context to the idea of problematic displacement through games. Consider the subjectivity and limitations of self-reporting complicated displacements and related information.
Game Genres and Wellbeing (Study 3)
The choice to use structured metadata repositories for genre categorization is sufficiently justified, and the acknowledgment of the limitations of self-report and researcher-ascribed taxonomies is valuable.
One of these justifications is the accommodation of genre fluidity and evolution (p. 13). This is true within the context of contemporary, user-generated tags and genres. However, I recommend considering the fact that once analysis begins, the genre classifications will have to become a fixed set and may desync from the genres presented by the database.
Page 13 discusses the community repositories in generalities, and the service used in the study is only specified in the methods section. It thus remains unclear whether the genre classifications on the Internet Games Database are controlled by developers, service admins, the public, or some combination of these. I read it as implied that the study will consider only “genres” within the service, not “themes”, but I wish this were explicitly stated and the definitions considered. The categorization used by IGDB can be questioned, and the choice to include or exclude these different layers of classification should receive careful justification, despite the already explored limitations of all genre categorization.
Method:
It would be interesting to see both details and analysis of the Nintendo data (Table 1). What is the exact definition (or list) of “close partners”? The note on the sales dominance of first-party games is valuable, but further analysis could help the paper show how the lack of third-party data could affect, for example, genre-related findings. This consideration would increase data validity and transparency.
Page 14 contains the only mention of “game modes” in the paper. Such data is not listed in Table 1, so I deduce that it will be extracted through the surveys. Details on the extraction and relevance of this data would be useful.
Exclusion Criteria and Missingness
Consider whether additional work on understanding and excluding false hours from digital tracking could help increase validity and reflexivity. While technical problems and system clock manipulation are concerns, I would also consider the risk of “idle hours”: leaving games running while not actively playing. Ensuring that self-reported and digital trace data correlate is a good way to mitigate this risk, but I did not see a specification of how closely the hours must correlate to be considered valid.
--
I considered the report novel, well-written, and impressive in its depth and scale.
The submitted programmatic registered report provides a plan for a large-scale project comprising three individual studies focusing on three broad effects of video gaming. As is typical for this type of RR, the theoretical introduction is quite general, presenting a rationale for quite heterogeneous research aims and hypotheses, so it can serve the future manuscripts only with further elaboration.
I see three main strengths of this RR. The first is the use of real-time, first-hand data, where agreements with the video game providers allow for this. The second is the sample, which seems suitable for the planned analyses, although a power analysis has not been conducted (the design operating under various constraints). Third, I appreciate the longitudinal data collection with a focus on mental health, which allows investigating real-time changes.
The hypotheses are clear and justified; my only comment concerns H2a and H2b in Study 3, which are not directional, and it is not completely clear how they will be evaluated.
Based on the above, I have almost no recommendations for the authors. The planned instruments, process of data collection, and statistical analysis seem appropriate for the stated aims and hypotheses.