Dear authors,
Thank you very much for the thorough revisions. All my concerns have been adequately addressed and, where necessary, incorporated in the revision. I wish the authors all the best for their data collection.
Sebastian Berger
Dear editor and authors,
I went through the author replies and revised materials, and I am satisfied with their work.
The authors addressed all my comments adequately and I think they are ready to proceed
with the study. I look forward to seeing the results.
Kind regards,
Dr. Ben De Groeve
I would like to thank the authors for their thorough and thoughtful responses to my comments. They addressed all of them and the manuscript has further gained in clarity. I happily recommend in-principle acceptance and wish the authors all the best with regard to their data collection. I look forward to the results.
Florian Lange
I think that the authors have addressed my concern, and testing each study separately in different halves of the sample makes sense.
DOI or URL of the report: https://osf.io/q3v8u/
Revised manuscript: https://osf.io/gs4u3
All revised materials uploaded to: https://osf.io/h2pqu/, updated manuscript under sub-directory "PCI-RR submission following R&R"
I have now received four very helpful expert reviews of your Stage 1 submission. In general, the reviewers find substantial merit in your proposed replication and the signs are promising for achieving Stage 1 in-principle acceptance (IPA) in due course. There are, nevertheless, a number of significant issues to address to satisfy the Stage 1 criteria.
One of the major concerns raised by the reviewers is the decision to combine study 1 and study 2 into a single survey using a within-subjects design. The reviews query this deviation from the original methodology and the risk of introducing order effects (despite randomisation) that may introduce ceiling effects (in at least a sub-sample) or otherwise hamper replicability and validity. One solution to this problem that you may want to consider would be to double the sample size, running 50% of participants in one order and 50% in the opposite order. You could then analyse the first session in a between-subjects (closer) replication of the original study while also using the total dataset for any within-subjects analyses.
The reviewers also raise substantial concerns about beliefs in animal emotion and cognition having changed since the publication of the original study 10 years ago; sampling bias due to the way in which “meat-eaters” are selected in the inclusion criteria (deviating from the original study); the validity and implementation of the manipulation check; and the structural organisation of the manuscript, which at least one reviewer felt didn't provide essential methodological detail in the best place.
This summary of issues is far from comprehensive, and you will find enclosed in the reviews a variety of other points -- often accompanied by constructive proposals for solutions. On this basis I am happy to invite a Major Revision and hope you will find these reviews as useful as I have found them.
I have read over everything and it all looks good to me. There is only one concern that I have, which I think is quite substantial. Why are the authors combining Studies 1 and 2 into a single survey? This seems quite problematic to me. If I am thinking about the minds and edibility of animals, and then I am thinking about a sheep or cow in a paddock, won't the concept of eating animals be primed for me already and potentially impact how I think about that animal? I think it would be a fair criticism to say that this design could undermine a clean experimental manipulation - comparing thinking about animals as food vs. not as food.
The other direction, where the Study 1 replication comes second, would also be impacted, as people who have read about animals being harmed in the meat production process may be motivated in their ratings of edibility and mind.
I think this is a fair criticism of the suggested approach; the combined design would likely undermine the validity of the replication attempt.
Brock Bastian
Dear Chris Chambers and authors,
I am pleased to review the Stage 1 Registered Report entitled "Revisiting the motivated denial of mind to animals used for food: Replication and extension of Bastian et al. (2012)" in accordance with the Stage 1 criteria listed in the Guide for Reviewers of the PCI initiative.
1A. The scientific validity of the research question(s).
Given the reproducibility crisis in psychological science and the current evidence showing a motivated denial of food animal minds, the proposal to replicate Studies 1 and 2 of Bastian et al. (2012) using highly powered samples is theoretically justified. The research questions are clear and summarized in the PCIRR-Study Design Table on p.5 and are answerable through quantitative research. The scientific validity of the study is covered in the introduction.
1B. The logic, rationale, and plausibility of the proposed hypotheses, as applicable.
In the introduction, the authors provide a rationale for replicating the original findings of Bastian et al. (2012). The proposed hypotheses are precise and summarized in the PCIRR-Study Design Table on p.5, though it seems that H1b (moral concern) and H1c (negative affect) have been switched. In addition, in Table 1 on p.9, H1b refers to negative affect and H1c to moral concern, while in the text on p.9 the findings are mentioned in reverse order. I recommend improving consistency in presenting and ordering the hypotheses and associated research questions.
1C. The soundness and feasibility of the methodology and analysis pipeline (including statistical power analysis or alternative sampling plans where applicable).
The authors will gather a well-powered sample and provide a clear summary of their study design (Table 3, p.14). Attention is paid to critical design features such as rules for inclusion and exclusion, randomisation, and reducing survey fatigue. The authors also did an analysis with simulated data.
Overall, the methods seem feasible, but I question the soundness of three design aspects.
(1C-1) Contrary to Bastian et al. (2012), the authors plan to exclude vegetarians and vegans at the beginning of the survey with the following item: "We are running a replication of a classic study meant for those who eat meat. Therefore, this survey is only for those who self-identify as meat eaters. If you are not a meat-eater (e.g., a vegetarian or a vegan), please return the HIT now. Please indicate: Do you eat meat? with options “Yes, I eat meat” and “No I do not eat meat”."
However, people rarely explicitly identify as a meat-eater because eating meat is the norm, so if participants read that the survey is only for those who self-identify as meat-eaters, they might not feel it is about them or they might feel pigeonholed. If the survey is meant for those who eat meat, I don't think it is necessary to explicitly refer to the meat-eater identity. More importantly, the item might prompt participants to think that (not) eating meat is an essential aspect of the study, and previous research suggests that mere exposure to vegetarians might arouse meat-related dissonance (Rothgerber, 2020). My concern is that this item might confound/influence the results of the study because, for example, participants may be more likely to see the non-food animal condition in Study 2 as having something to do with the fact that they eat meat and consequently see the non-food animal more like a food animal. In Bastian et al. (2012), vegetarians were identified and excluded at the end of the survey, so participants' dietary identity was less likely to be salient. To avoid this confounding risk, the authors could use the same approach as Bastian et al. (2012) and/or use a prescreening tool to select people who eat meat unbeknownst to the participants. I know this is possible in Prolific; I don't know about MTurk.
(1C-2) The design of Study 2 largely corresponds with the design of Bastian et al. (2012), but the original design is rather complicated. What is the reason for the within-subjects design and for comparing the mental capacity of a non-food sheep with a food cow (or of a non-food cow with a food sheep)? If we accept the within-subjects design, I speculate that participants would probably respond more consistently if they have to assess the mind of the same animal twice (i.e., non-food followed by food condition) so that it would be more difficult to find an effect. This might arguably be circumvented by using different animals, though this rationale, or the rationale for a within-subjects design, is not mentioned in the current manuscript. In other words, I think the experimental set-up of Study 2 requires more justification.
(1C-3) Contrary to Bastian et al. (2012), the authors include manipulation checks in Study 2, which helps to ensure that participants read the scenarios carefully. Nevertheless, I want to caution the authors that the manipulation check of the non-food condition might also affect participants' response to the food condition and make it easier for them to correctly guess the hypothesis related to Study 2. Potential options include pretesting, requiring participants to stay a small amount of time on the pages, not revealing the food condition in the non-food manipulation check, and/or removing or replacing the non-food manipulation check.
Relatedly, what would happen to participants who fail the manipulation check? In the procedure (p.19), the authors write that participants have to answer correctly to continue. Will participants be informed about this in the consent form?
1D. Whether the clarity and degree of methodological detail is sufficient to closely replicate the proposed study procedures and analysis pipeline and to prevent undisclosed flexibility in the procedures and analyses.
The authors provided a clear design summary table (p.5), though some methodological aspects were less clear to me.
(1D-1) Related to point (1C-2) above, Bastian et al. (2012) write in their results of Study 2 that "mean mental capacity ratings were calculated for each animal and each condition. Participants’ ratings of sheep and cows did not differ within either condition so we collapsed across versions. This yielded two animal types: food animal and nonfood animal." (p. 250). This implies that they originally did another test before they performed the t test. I suspect it was an ANOVA with a between-subjects factor (species: sheep last/first vs. cow first/last?) and a within-subjects factor (non-food vs. food), but this is not mentioned. Will the authors also test this between-subjects effect? How would the authors respond if participants' ratings of sheep and cows do differ? What would be the consequences for interpreting the results? In short, I think the experimental set-up of Study 2 requires more methodological detail.
That being said, I laud the authors for their personal correspondence with the first author to confirm a description error concerning the t test.
(1D-2) The authors combined the two studies into a single data collection (displayed in random order and with minor adjustments) and mention that this design allows "to both test the designs of the original studies, and to run further tests in comparing the effects of the different studies with the potential of additional insights" (p. 10). I wonder which further tests the authors had in mind because I did not read anything about this in the method section.
(1D-3) Concerning the attention checks, the exclusion criteria are rather vague; the authors merely note that failing the checks could be reasons for exclusion, so I recommend more clarity here.
As a side note, it might not be clear to participants whether "last week (p. 17)/month (p.26)" includes their participation in the questionnaire. Nevertheless, if the participants pay attention, the chances seem high that they select "Used a computer, tablet, or mobile phone" anyway, so this does not seem to be a big issue. In addition, the first answer option (p.26) is in the present tense ("run a marathon"), unlike the other options, which are in the past tense.
(1D-4) In the supplementary material, I think the sheep in the picture is incorrectly described as a lamb. I recommend to resolve this inconsistency.
1E. Whether the authors have considered sufficient outcome-neutral conditions (e.g. absence of floor or ceiling effects; positive controls; other quality checks) for ensuring that the obtained results are able to test the stated hypotheses or answer the stated research question(s).
To ensure high data quality, the authors plan to include two attention checks and manipulation checks (though see points 1C-3 and 1D-3 for criticism). The authors will also employ Qualtrics fraud and spam prevention measures (e.g., reCAPTCHA, etc.) and several CloudResearch options in MTurk (i.e., Duplicate IP Block, Duplicate Geocode Block, Suspicious Geocode Block, Verify Worker Country Location, Enhanced Privacy, CloudResearch Approved Participants, Block Low Quality Participants, etc.), though it is not clear to me what these CloudResearch options mean.
Bastian et al. (2012) write in a footnote that they also gathered a sample of vegetarians to examine whether vegetarians lack a motivated denial of food animal minds. Including such a control in Study 2 could strengthen the theory (i.e., assess whether motivated denial only occurs in meat-eaters), though the design of the study would admittedly become more complex.
Kind regards and good luck with the study.
Reviewer
The authors propose a Registered Report of two of the studies included in Bastian et al. (2012). They combine a correlational study and a within-subject experiment into one online survey. Apart from this combination, the proposed methods are very close to the original study and any deviations are clearly identified and motivated. Other strengths of the proposal include the very good rationale for conducting a replication, the compelling power analysis and sampling plan, and the transparent reporting of all materials. If I had to criticize something, it is that the information and materials are at times a bit scattered throughout the multiple parts of the manuscript and supplement. As a result, I have been confused about, for example, which exclusion criteria will be applied in confirmatory analyses. Please find my remaining comments below. I hope they are helpful in revising this manuscript and I am looking forward to seeing the results of this high-quality Registered Report.
P6: I would revise the very first sentence. Who is “we” here and what is meant by “care”? Without this being clear, the reader cannot judge whether “we care for animals” is truly a “fact”. Perhaps it is rather that some people express appreciation of animals yet eat them. I also do not think that it would be reasonable to say that such an observation “demands explanation”.
P6: The term “moral patients” is unclear and so is what it would mean for human-animal relationships to be “legitimate”.
Paradox framing: I can see how framing so-called attitude-behavior gaps like the present one as “paradoxes” can seem appealing, but actually, this does not seem very sound or reasonable to me. The described pattern would only be paradoxical if we conceived of beliefs/attitudes as absolute entities/categorical constructs: people either care for animals (or another attitude object) or not and if they care for them, all their behaviors would need to be directed at expressing this attitude in order not to be paradoxical. I don’t think this is how we treat attitudes in psychological research. We consider them as dimensional (more or less pronounced) and we know that different attitudinal goals can conflict. If people say they “care for animals” and “animals should not be harmed”, but then eat animals, well, I would say their positive attitude/caring for animals was not particularly strong or their behavior was (also) a function of another attitude. This does not seem very paradoxical to me. Against this background, I would like to encourage the authors to rethink if they want to reproduce the paradox framing here.
P7/8: I would not say that the Bastian studies “show that those animals that are perceived to be edible are also likely to be perceived as lacking a mind...” This would be an overgeneralization of their findings which have been obtained under very limited conditions – I don’t think the phenomenon can be considered established yet.
P11/12: I think the calculations and considerations regarding statistical power are very strong and compelling. I would only change the last sentence of that section as it is not clear to me for what “this sample size would be more than sufficient”.
P12: In the participants section, I was missing a description of the sampling process and a priori exclusion criteria. From which population will the sample be drawn, i.e., to which population will/can the results be generalized? Will the study be advertised to everyone on MTurk and then filled on a first-come-first-serve basis? Will it be advertised specifically to non-vegetarians? I also stumbled across the “etc” in the MTurk/Qualtrics quality checks. Isn’t it possible to provide an exhaustive list at this point?
P12: As to the payment part, it is not clear if the data from the 30 participants has led to an adjustment of the payment amount and, more importantly, if these data (which have already been collected, I think) will be included in the total dataset (so that only 970 additional participants will be recruited). Please clarify.
P17: I do not understand the section on Manipulations and its function. This information has already been provided elsewhere and, if anything, it is more vague here than before. In general, I find the structure a bit jumpy and the logic of the headings not always clear. One factor contributing to this might be that sometimes the word “replication” is added to the heading (I don’t clearly understand when and why).
P18: I appreciate the careful analysis of between-study differences. I think the removal of the filler task is worth another thought. I do follow the authors’ reasoning (its theoretical significance was not clarified in the original study, so the original authors probably did not consider it to be very important), but I guess it is not implausible for this methodological detail to play a critical role. The lack of a filler task might either promote consistency in participants’ judgments or amplify the contrast between the two animals. I think it would be worthwhile to contact the original authors about this.
P21: I think the statements regarding the exclusion criteria contradict each other. Here, it says that participants who failed the attention checks will be excluded; on page 17, it says that such participants could be excluded in exploratory analyses; and I think the supplementary materials on page 28 contain the same contradiction. I would encourage listing all criteria that will be applied for the confirmatory analyses here (on page 21), not to refer to the supplement for this, and (potentially) not to refer to any criteria that might be applied in exploratory analyses to avoid confusion.
P21: Please make explicit: will 3 SD outliers (for which variables) be removed?
Results: I think it was not made explicit if results will be tested using one-sided or two-sided tests. Please specify and sorry if I overlooked it. I also did not find the evidence criteria. I assume the authors will consider the replication of Study 2 successful if p < .05 and the difference is in the same direction as in the original, but this should be made explicit. For Study 1, is this a family of hypotheses? What will be concluded if only one or two of the corresponding tests are significant? Will there be correction for multiple testing?
Supplement: I think this needs to be cleaned up. Some of the parts do not seem relevant for and adjusted to the specific study (or I do not understand the relationship). The section “additional information about the study” seems to include methodological information that is missing in the main text. Please integrate.
Signed: Florian Lange
1A. The scientific validity of the research question(s).
My evaluation is that the scientific validity of the research question is high. The present submission is an attempt to directly replicate two studies from a paper by Bastian et al. (2012). The authors of the original study investigate the “meat paradox”. The central idea is that meat takes a prominent place in most people’s diet and is part of their culinary enjoyment. At the same time, people dislike harm done to animals, creating an inconsistency referred to as the meat-paradox (Loughnan et al., 2010). People’s concern for animal welfare conflicts with their culinary behavior. The authors of the original study argue that people are motivated strongly to overcome this inconsistency and that this is achieved through motivated moral disengagement driven by a psychologically aversive tension between people’s moral standards (caring for animals) and their behavior (eating them). In particular, they focus on one disengagement mechanism, namely the denial of food animal minds, and therefore their status as moral patients.
The original paper has been quite influential and has received 476 citations (Google Scholar, accessed May 2022). Besides this obvious scientific impact, the underlying theoretical ideas of the paper have been influential with respect to a lot of follow-up research on meat consumption. I therefore think that it is a very suitable target for a replication, and the authors of the present manuscript (henceforth: Jacobs et al. (2022)) attempt to subject Studies 1 and 2 of Bastian et al. (2012) to high-powered direct replications. I am reviewing the research project as a Stage 1 RR.
1B. The logic, rationale, and plausibility of the proposed hypotheses, as applicable.
The present submission includes four hypotheses (1a-c, 2). I evaluate them jointly below, as my overall evaluation is very positive.
Hypothesis 1a: Mind attribution is negatively associated with perceived edibility of animals.
Hypothesis 1b: Mind attribution is positively associated with negative affect regarding eating animals.
Hypothesis 1c: Mind attribution is positively associated with moral concern for animals.
Evaluation: Hypotheses 1a-c are straightforward in my opinion. They directly follow from the original research that Jacobs et al. (2022) attempt to replicate. Hypotheses 1a-c contain no causal claim, so the data will be silent with respect to whether eating an animal causes humans to deny it a mind, or whether mind attribution causes a lack of eating. Given the existing research, the correlational hypothesis is sound.
Hypothesis 2: Being told that animals will be raised for food consumption (compared to being told it will live as a grazing animal) leads to denial of mind to those animals.
Evaluation: Hypothesis 2 follows from an experimental design able to assess the causal relationship (eating → mind denial). The hypothesis is straightforward and sound.
1C. The soundness and feasibility of the methodology and analysis pipeline (including statistical power analysis or alternative sampling plans where applicable).
The statistical power of the current research design is plausible. The authors have taken great care not to overestimate the originally reported effect size and go to great lengths justifying their sampling decisions, including necessary checks and balances. They adjust their power using the safeguard method and replace the original effect size with the lower bound of the 60% CI. In addition, they use the smallest effect size from Studies 1 and 2 in the original study. Due to a change in the design (fewer trials), they further increase the required sample size to 1,000 respondents. Thus, power should be high enough to detect an effect, if any.
Further, as Jacobs et al. (2022) attempt to recruit participants online via Amazon Mechanical Turk, further steps to assure data quality are necessary. The authors use several measures to secure data integrity, the most relevant being recruitment via CloudResearch/TurkPrime together with several options to ensure high data quality.
A remaining risk factor is that the authors may be unable to recruit 1,000 participants within a given time-frame. Due to my inexperience with Cloudresearch/Turkprime, I cannot quantify this risk, but I want to suggest a risk mitigation plan that may include collection of additional data via an alternative research platform (e.g., Prolific), if necessary.
I want to comment on other potential aspects. Note that many aspects raised result from the fact that the authors are attempting a close replication of Bastian et al. (2012), and potential weaknesses may arise from weaknesses that occurred in the original study, rather than in the replication.
Blinding: One issue may arise from experimenter demand effects. Because participants are asked about the suitability of various animals that are typically not part of the diet in the target population (e.g., lion, elephant, etc.), they could easily guess potential research questions. Thus, a threat to internal validity is, in my opinion, that blinding participants to the research hypothesis is not easily possible. A potential solution could lie in asking participants at the end of the survey about the hypothesis or the research question. If such a question is asked, the exclusion protocol needs to be adjusted.
Randomization: The randomization of Study 2 (i.e., the experimental study) seems unproblematic, and order effects arising from the counterbalancing should be easily detectable given the large sample size. However, I fear that unintended order effects may restrict the range of data in Study 1, potentially causing ceiling effects. I can most easily describe this risk from my own experience clicking through the survey. When taking the survey, I was first prompted to rate a mole. I ascribed particularly high values in the questions asking for mind-attribution, not having seen the other animals. Then, some animals were presented to which I would ascribe “higher” mind-attribution (e.g., elephant, gorilla, monkey): hence, I would have produced more variance initially had I known the range of animals. The same issue may arise with the likelihood of eating. In consequence, I think that the presentation order may strongly affect whether or not support for Hypotheses 1a-1c may be found.
Data exclusion: The data exclusion protocol was presented in a way that is easy to follow. The data exclusion protocol does not incur unnecessary researcher degrees of freedom.
Deviations from research protocol with respect to original work: The authors are very transparent about the deviations in their research protocol. I evaluate all deviations from the original study as plausible and would support the updated study design. The authors do a good job making these very transparent.
1D. Whether the clarity and degree of methodological detail is sufficient to closely replicate the proposed study procedures and analysis pipeline and to prevent undisclosed flexibility in the procedures and analyses.
The clarity and degree of methodological detail is sufficient to closely replicate the proposed study procedures. Undisclosed flexibility in the procedure and analysis is unlikely to remain undetected.
1E. Whether the authors have considered sufficient outcome-neutral conditions (e.g. absence of floor or ceiling effects; positive controls; other quality checks) for ensuring that the obtained results are able to test the stated hypotheses or answer the stated research question(s).
The research question is a direct (or close) replication. Thus, the authors are constrained in freely deciding upon the experimental design. As discussed above, order effects may inflate the risk of ceiling effects. In general, I encourage the authors to include a mitigation plan for this possibility. Although this does make the stage 1 report a bit more complex, I think that it may help with respect to assessing the replicability of the original result. An alternative could be to run a more conceptual rather than direct replication, should the problem of ceiling effects emerge in piloting.
Other remarks:
On a more general level, I have a concern that beliefs in animal emotion and cognition could have been on the rise since the publication of the original study, thereby increasing the problem of ceiling effects. Many popular science books have emerged on the topic (e.g., Frans de Waal has published two books that I recall: “Are We Smart Enough to Know How Smart Animals Are?” and “Mama's Last Hug”). In addition, Netflix has devoted widely received documentaries to the topic of animal intelligence and emotions, for example “My Octopus Teacher” and “Explained: Animal Intelligence”. Furthermore, animal welfare and vegetarianism are potentially more broadly discussed in the media than in 2010. As a result, a potential non-replication may result from general trends in society rather than from the absence of a true effect in the initial study at the given time point.
I wish the authors of the current submission all the best for their project, Sebastian Berger