DOI or URL of the report: https://osf.io/nq9yx/
Version of the report: v1
Dear Dr. Zhang,
Thank you for revising your stage 1 report entitled “Is the past farther than the future? A registered replication and test of the time-expansion hypothesis based on the filling rate of duration”. I sent the revised version to the reviewer who had previously commented on your use of the measurement of the filling rate of duration.
I would like you to address their suggestion in a further minor revision of the stage 1 report. I would also like to expand on the reviewer’s comments.
You write that, “Analogous to the stimuli and tasks used in research of short durations, in longer durations we assume that how much the duration is filled with events, which we refer to as the filling rate of the duration in the present study, will also have an influence on psychological distance. The filling rate of the duration is a function of the number of events and the length of each event in the past and future.” However, this is an untested assumption and is also not backed up by any theoretical considerations, at least not in the current version of your report. It seems to me that it would be important to first establish that people take both aspects of durations into account before this can be used as a manipulation in this study. You further write (later on), “The filling rate of duration in our study is not only the number of intervening events in the duration, but also the length of each event (see Supplementary Information). Moreover, there must be events that we have actually experienced in the past (will experience in the future), even if they are not listed. What we focus on in our study is the event, which we have actually experienced or will experience in the future, and its length. In this respect, the focus of our study is different from that of Caruso et al. (2013).” Without knowing that this is how participants do indeed experience durations and also that they are able to take into account both aspects when answering questions about durations, your measure rests on many assumptions. Nowhere does the reader receive information on whether it is known that participants actually pay attention to both the number and the duration of events when assigning fullness to a duration, and how these two aspects interact in such assignments. In this situation, the reviewer’s advice seems very useful.
However, it seems critical to also address the lack of theoretical or empirical justification for the assumptions mentioned above (or provide it), as well as to adjust the text throughout the report where appropriate with regard to the new measure if you decide to pursue the reviewer’s suggestion.
I hope that you decide to thoroughly address this issue in what I hope is the final revision that this stage 1 report requires.
I thank the authors for clarifying how they intend to account for the length of intervening events in their methods. However, by including only a single question about filled rate of duration (for both past and future), I do not see how they can disentangle the potentially separate contributions of length and absolute number of intervening events. I would therefore recommend including two questions for both past and future: one about the length of intervening events (as currently included), and one about the absolute number of intervening events.
DOI or URL of the report: https://osf.io/w45bf/
Version of the report: v1
Dear Dr. Zhang,
The revised version of the stage 1 report entitled “Is the past farther than the future? A registered replication and test of the time-expansion hypothesis based on the filling rate of duration” has now been assessed by two reviewers, one of whom had also assessed the original submission.
I agree with both reviewers' comments and queries, and would like to invite you to address these in a further, minor revision.
I appreciate the changes you made to the report to explain the third experiment by Caruso et al. and the differences to the proposed study. Upon reading the revised report I had a query, which has also been flagged up by one of the reviewers. This is to do with the fact that you write that “the filling rate of duration is a function of the number of events and the length of each event in the past and future” without providing supporting references or other arguments or models for this statement. Accordingly, and this is what the reviewer flagged up, you seem to imply that the filling rate as assessed in your study comprises both frequencies and durations of events; however, this is not what your design captures. Therefore, I would like to invite you to address this carefully in both places.
The authors have responded very thoroughly and impressively to my first review. The design is now much clearer to me (and in my opinion stronger), and the manuscript is just about ready for Stage 1 IPA.
I have only one further question, related to the revisions that have now been made. For H2-1 and H2-2, the authors propose analysing only a sub-sample of the total sample size based on a power analysis suggesting that N=102 and N=386 would be sufficient, respectively. This power analysis appears to be based on a generic d (or dz) of 0.4, as advocated by Brysbaert (2019 -- note that this reference is not in the reference list, so I am not sure which article exactly it is referring to; on an additional minor note, please also ensure that you use "dz" for the within-subjects effect sizes and "d" for the between-subjects effect sizes).
Possibly I am missing a key point here, but my main question is: given the generic nature of the 0.4 effect size estimation, why not take advantage of the total sample size in each study to also test H2-1 and H2-2? It is true that H1 requires a much larger sample size than H2-1 and H2-2, but regardless it seems a shame to leave all of that additional data unanalysed when it would be diagnostic about the predictions, and when including this extra data will simply have the benefit of making the statistical tests for H2-1 and H2-2 more sensitive to effects smaller than 0.4 (which I assume would still be theoretically relevant). The authors could report an a priori sensitivity power analysis stating what effect size they have 0.95 power to detect for H2-1 and H2-2 given the full sample size (presumably much smaller than 0.4).
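As a rough illustration of the kind of sensitivity analysis suggested above, the following Python sketch computes the smallest standardized effect detectable at a given power for a given sample size. It uses a normal approximation to a two-sided t-test, so it will differ slightly from an exact calculation. The function name, the alpha level of 0.02, and the larger sample sizes in the loop are assumptions made for illustration; they are not taken from the report.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_effect(n, alpha=0.02, power=0.95, between_subjects=False):
    """Smallest standardized effect detectable with the given power.

    Normal approximation to a two-sided test (an assumption; an exact
    calculation would use the noncentral t distribution).
    Within-subjects: n is the number of participants, result is dz.
    Between-subjects: n is the size of EACH group, result is d.
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    scale = sqrt(2 / n) if between_subjects else sqrt(1 / n)
    return z_sum * scale


if __name__ == "__main__":
    # Sample sizes here are placeholders, not the planned totals.
    for n in (102, 386, 1000, 2000):
        print(n, round(min_detectable_effect(n), 3))
```

Under these assumptions the detectable effect shrinks in proportion to one over the square root of the sample size, which is why analysing the full sample would make the tests sensitive to effects well below 0.4.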
My only other comment is that in the study design table, the content of the columns "Rationale for deciding the sensitivity of the test for confirming or disconfirming the hypothesis" and "Interpretation given different outcomes" should be adjusted slightly to fit requirements. The column "Rationale for deciding the sensitivity of the test for confirming or disconfirming the hypothesis" should provide a justification of the smallest effect size of interest and power level for each hypothesis test, rather than a description of what significant differences would indicate (as currently). The existing content for this column should instead be combined with the existing content in the "Interpretation given different outcomes" column and then included solely within the "Interpretation given different outcomes" column.
Minor: in the third column of the study design table there is an inconsistency in line spacing.
The study seems well designed to test the intriguing and plausible alternative hypothesis raised by the authors. The hypotheses and methods - ostensibly the most important aspects of a registered report - are very clear and reasonable. I only have a few queries and comments (point 3 being the most important):
1. The authors identify previous attempts to explain the TDE from perspectives other than spatial movement (Gan et al., 2017; Mrkva et al., 2018; McCormack et al., 2019), but don't describe these alternative explanations in any detail. It would be helpful to further clarify these explanations in order to illuminate the novel contribution of the authors' own hypothesis.
2. The authors write that the past comprises both "predetermined and sudden events", but it would seem that "expected and unexpected events" would be more appropriate in this context.
3. The authors claim that, unlike in previous studies, they will focus on the length of intervening events as well as the absolute number of events. Yet, their Likert scales don't seem to capture event length in any precise manner. How is this factor being incorporated into the current study, and how does this differ from previous studies?
4. It is unclear why the authors would limit their analyses to a small fraction of the sample when testing some of their hypotheses. Why not include the full sample to get a better estimate of the true effect? Perhaps there is a good reason for such a statistical practice, but if so the authors should clarify it.
5. The sample will include residents of Japan, whereas the Caruso et al. (2013) studies seem to have included American undergraduates and M-Turk participants. Are there any potential cross-cultural differences in time perspectives that could produce differences in the findings, independent of the authors' hypothesis? Perhaps this is simply an unavoidable limitation that would need to be addressed in the Discussion.
DOI or URL of the report: https://drive.google.com/file/d/1bAYnbCSt1vBUHGL-u_s8UMXOwrwMEe2c/view?usp=sharing
Dear Dr. Zhang,
The stage 1 report entitled “Is the past farther than the future? A replication and time-expansion hypothesis based on event frequency” has now been assessed by two reviewers.
Both reviewers raise important questions about the theoretical background for your study as well as about the methodology proposed for this study. This includes, but is not limited to, Reviewer 1’s request for more background information regarding the ‘event frequency’ hypothesis and Reviewer 2’s request to provide more information on how this hypothesis has been discussed and tested by Caruso and colleagues. Caruso et al. (2013) consider alternative explanations in their paper and in that section report an empirical study in which they test whether listing tasks decreased, rather than increased, the psychological distance for a date 3 weeks in the future relative to a control group with no listing. Your stage 1 report would benefit from discussing these results and also discussing how your way of testing the ‘event frequency’ hypothesis differs from this previous one (and where the benefit lies in testing it directly in a comparison between future and past time points). Related to this part of your study, Reviewer 1 also suggested that a within-subject design may provide a stronger procedure for testing this hypothesis.
Both reviewers also have additional questions/comments regarding methodological and sampling aspects that would need to be addressed at this stage. Based on all these comments, I would like to invite you to carefully revise your stage 1 report.
This is a signed review (Chris Chambers, Cardiff University).
In this Stage 1 submission, Zhang et al propose an interesting pair of experiments to test whether a phenomenon termed the temporal doppler effect (TDE) – in which people perceive events in the past to be further from the present than events at an objectively equivalent point in the future – might be explained by a process related to the filled-duration illusion (FDI) – in which periods of time with a greater frequency of events or changes are perceived as being longer in duration.
Over two experiments, the authors propose to replicate the TDE, and further, to test whether the magnitude of the TDE is related to subjective judgments of event frequency.
My overall evaluation of this proposal is positive: the research question is clearly articulated and the hypothesis offers a potentially elegant test of an underlying cause of the TDE – assuming the TDE itself replicates successfully, which the authors will also test.
My main comments concern the justification of the link between the TDE and FDI, whether the proposed studies provide a sufficiently severe and falsifiable test of the overarching hypothesis, and the degree of methodological detail. I summarise these points below and have attached a commented version of the Stage 1 manuscript which expands these points and places them in a specific context (along with some additional suggested revisions).
1. The authors’ hypothesis that the TDE is related to (and possibly driven by) same/similar mechanisms to the FDI depends on an auxiliary assumption (tested in H2-1) that people will report a greater frequency of events in the past than they will anticipate in the future. I may have missed this, but I couldn’t find any basis for this hypothesis, either based on theory or previous evidence. As noted in my in-line comments in the manuscript, I feel this issue needs more attention and foundation.
2. It wasn't clear to me whether H2-2 (event frequency related to psychological distance) will be tested through separate correlations in past and future conditions, or through the data collapsed across these conditions. This in turn led me to wonder whether the link between the TDE and event frequency is being tested in the most severe way possible given the current between-subjects design in which past and future judgments are made by separate groups. As noted in my in-line comments, an augmented design could involve the inclusion of past and future judgments in the same participants (in a counterbalanced order) – this would allow for the calculation of a Past vs Future difference score which could then be related to the polarity and magnitude of the TDE. In addition, by selectively analysing the first session of the counterbalanced sequence, the authors could test H1 in a comparable between-subjects manner to Caruso 2013; and by taking into account the difference in event frequency between Past vs Future, the authors could establish a tighter link between the measures. For example, would participants who show a robust opposite trend in event frequency between Past vs Future also show an opposite TDE? If so, this would provide strong support for the authors' overarching proposition. If not, it seems to me it would also provide a more severe and convincing disconfirmation.
3. Throughout the manuscript, there are various points where greater methodological detail needs to be provided, e.g. concerning sampling, exclusion criteria, and instructions to participants. I was also left wondering whether the survey question should include attention checks to protect against inattentive or random responses. Finally, I felt that the rationale for both the 1-month and 1-year studies needs strengthening, over and above the fact that the studies replicate the approach taken by Caruso 2013.
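To make the difference-score analysis described in point 2 concrete, here is a minimal Python sketch, assuming each participant gives both past and future ratings. It computes a Past minus Future difference for event frequency and for psychological distance, then correlates the two sets of differences. The function names are hypothetical and the ratings are made-up placeholders, not data from any study.

```python
from math import sqrt


def difference_scores(past, future):
    """Per-participant Past minus Future difference."""
    return [p - f for p, f in zip(past, future)]


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Placeholder ratings for four hypothetical participants.
    freq_past, freq_future = [5, 4, 6, 3], [3, 4, 2, 5]
    dist_past, dist_future = [6, 5, 7, 4], [5, 5, 5, 5]

    freq_diff = difference_scores(freq_past, freq_future)
    tde = difference_scores(dist_past, dist_future)  # positive = past feels farther
    print(pearson_r(freq_diff, tde))
```

A positive correlation would mean that participants who report more events in the past than in the future also judge the past as farther away, and that participants with the opposite frequency pattern show a reversed TDE.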
The attached in-line comments include expanded discussion of the above points and some additional minor points, including some suggested language edits.