This proposal sets out to replicate a recent study on "the Medusa effect", wherein people are perceived as more real, mindful, and agentic when they are presented as pictures than when they are presented as pictures of pictures (Will et al., 2021, PNAS). In particular, the authors will use the same stimuli and procedures as the original study and replicate two experiments: one testing ratings of realness, agency, and experience (Study 1, replicating Exp. 2); and the other testing donations in a dictator game (Study 2, replicating Exp. 5). Should those fail, they plan to conduct Studies 3a-b with new stimuli. While I think the Medusa effect is interesting, I think many aspects of the proposal need to be clarified, including the motivation, important details of the design, and plans for null hypothesis testing. I list these in more detail below, in the hope that they will be helpful to the authors.
The motivation for conducting the present study is simply that the Medusa effect has never been replicated before (p. 6). While I am in perfect agreement with the authors that the Medusa effect is interesting and important, this rationale of course applies to many (if not most) effects that are just as interesting and important. And actually, the original study already contains multiple direct and indirect demonstrations of the effect, and follows open data practices.
More generally, this is not a direct replication testing the reproducibility of the Medusa effect, because it crucially involves a different sample/culture. This should be highlighted and motivated as a key difference: is there past work suggesting cultural differences in mind perception or abstraction perception? Do the authors have reason to believe this will make no difference?
I had several comments on the discussion of possible outcomes:
- Will the authors take their results to support H1 and H2 only if all three DVs show an effect? What if only two do? Or one?
- The authors only mention the possibility of failure to replicate the original effect ("If neither H1 nor H2 are supported, the reproducibility of the Medusa effect may be problematic.") but do not discuss it further. This might instead be a good occasion to mention differences between the original and current study (e.g., different sample, different recruitment platform, ...).
- It is unclear why limitations of the stimuli may explain disconfirmation of H1 and confirmation of H2. Why would the stimuli matter for H1, but not H2? And why would they explain a disconfirmation, if they are the same as in the original study?
- Another reason why H2 but not H1 may be confirmed is that the Medusa effect has stronger consequences for implicit behavior, while explicit judgments are more variable; and vice versa, H1 but not H2 may be confirmed if the Medusa effect has stronger consequences for explicit (vs. implicit) behavior.
While I empathize with the authors' preference for not conducting studies in the laboratory, I don't think this should be a determining factor in choosing which experiments to conduct (p. 7: "Since the COVID-19 pandemic is still in the process, we chose not to replicate them."), since of course this is an ongoing, constantly evolving situation. And beyond the online/in-lab question, I think the types of controls employed in Experiment 4 are important, and the authors should consider adopting those as well.
For all studies, the authors need to specify a plan in case the main analyses aren't significant, to determine whether the results are null or inconclusive.
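To illustrate what such a plan could look like, here is a minimal sketch of one common option: two one-sided tests (TOST) for equivalence on the paired differences. This is my own illustration, not something the proposal or the original study specifies. The function name `tost_paired`, the equivalence bound `bound` (a smallest effect size of interest in raw rating units), and the use of a normal approximation instead of the t distribution are all assumptions; the approximation is only reasonable for large samples such as the planned N.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def tost_paired(diffs, bound, alpha=0.05):
    """Classify paired differences as 'effect', 'null', or 'inconclusive'.

    diffs: per-participant differences (e.g., rating for the picture minus
           rating for the picture of a picture); must not all be identical.
    bound: hypothetical smallest effect size of interest, in raw units.
    Uses a large-sample normal approximation (an assumption of this sketch).
    """
    n = len(diffs)
    m = mean(diffs)
    se = stdev(diffs) / sqrt(n)  # standard error of the mean difference
    norm = NormalDist()

    # Ordinary two-sided test of H0: mean difference = 0.
    p_diff = 2 * (1 - norm.cdf(abs(m / se)))

    # Two one-sided tests against the equivalence bounds.
    p_lower = 1 - norm.cdf((m + bound) / se)  # H0: mean <= -bound
    p_upper = norm.cdf((m - bound) / se)      # H0: mean >= +bound
    p_equiv = max(p_lower, p_upper)

    if p_diff < alpha:
        verdict = "effect"
    elif p_equiv < alpha:
        verdict = "null"          # effect is reliably smaller than the bound
    else:
        verdict = "inconclusive"  # neither a difference nor equivalence shown
    return {"p_diff": p_diff, "p_equiv": p_equiv, "verdict": verdict}
```

The point of the three-way verdict is that a non-significant main test alone cannot separate "no effect" from "not enough data"; whatever method the authors choose (equivalence tests, Bayes factors, or something else), the bounds or priors would need to be justified in advance.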
- The authors mention that "Similar to Will et al.'s (2021) study, pictorial abstraction is a between-subjects factor." But abstraction is NOT a between-subjects factor in the original study ("Their task was to rate each of the two people shown in an image", p. 6). This should absolutely be corrected.
- Instead, the original study varied the DV (Realness, Agency, and Experience) between subjects. This should absolutely be implemented.
- Were the instructions directly translated from the original ones?
- Will definitions also be provided as in the original study?
- This sentence confused me: "there will be no strict time limitation so that the participants can [...] take no longer than 5 minutes"
- The authors say they will recruit 564 participants, but the table then mentions "more than 564".
- The table mentions "a paired t-test", but aren't the authors conducting three?
- Again, I am not sure how the quality of the stimuli could explain failed replication of H1, if those are the same stimuli as in the original study.
- What is the attention check?
- Why has the maximum donation amount ($10) been lowered (to 1000 yen) when $10 = ~1500 yen?
- Will et al. also had a rating phase in their Exp. 5; why is this omitted here?
- p. 3: The authors define the Medusa effect as a tendency for people to 'evaluate a “picture of a person” as more mindful than a “picture of a picture of a person”', but people aren't rating the mindfulness of pictures (L1), they are rating the mindfulness of people (L0) in those pictures. This should of course be clarified.
I found the writing often unclear, and I worry a naive reader unfamiliar with the Medusa effect might have a hard time following. I won't list every instance here, but the first few sentences of the abstract exemplify the problem:
- The very first sentence "Pictures play an important role in containing and expressing information related to the human mind" is puzzling since many pictures are unrelated to the human mind (e.g. landscapes); do the authors mean pictures of people/faces?
- The second sentence was also confusing: "compositional differences [...] affect the way we perceive the vast amount of information". It's unclear what "the vast amount of information" refers to; do the authors mean the way we perceive people?
- The third sentence contains an incorrect definition of the Medusa effect, as per my point above.
- The fourth sentence also confused me, since realness was never mentioned before (and is different from mindfulness; a rock can be real even though it doesn't have a mind); also, it's unclear what 'dimensions' refers to; do the authors mean 'abstraction' or 'compositionality' instead?
- p. 5: "Following the aforementioned prior study, Will et al. (2021) used five experiments": I am not sure what the 'prior study' refers to.
- p. 5: I don't think eyetracking is considered physiological data?