Grateful or indebted? Revisiting the role of helper intention in gratitude and indebtedness

based on reviews by Jo-Ann Tsang, Sarahanne Miranda Field and Cong Peng
A recommendation of:

Revisiting the Effects of Helper Intention on Gratitude and Indebtedness: Replication and extensions Registered Report of Tsang (2006)

Submission: posted 12 January 2023
Recommendation: posted 16 January 2024, validated 17 January 2024
Cite this recommendation as:
Chen, Z. (2024) Grateful or indebted? Revisiting the role of helper intention in gratitude and indebtedness. Peer Community in Registered Reports, .


When receiving a favour, we may feel grateful and/or indebted to those who have helped us. What factors determine how much gratitude and indebtedness people experience? In a seminal paper, Tsang (2006) found that people reported feeling more gratitude when the helper's intention was perceived to be benevolent (e.g., helping out of genuine concern for other people) than when it was perceived to be selfish (e.g., helping for selfish reasons). In contrast, indebtedness was not influenced by perceived helper intention. This finding highlighted the different processes underlying gratitude and indebtedness, and also inspired later work on how these two emotions may have different downstream influences, for instance on interpersonal relationships.

So far, there has been no published direct replication of this seminal work by Tsang (2006). In the current study, Chan et al. (2024) propose to revisit the effects of helper intention on gratitude and indebtedness, by replicating and extending the original studies (Study 2 & 3) by Tsang (2006). Participants will be asked to either recall (Study 2) or read (Study 3) a scenario in which another person helped them with either benevolent or selfish intentions, and rate how much gratitude and indebtedness they would experience in such situations. The authors predict that in line with the original findings, gratitude will be more influenced by helper intention than indebtedness. To further extend the original findings, the authors will also assess people's perceived expectations for reciprocity, and their intention to reciprocate. These extensions will shed further light on how helper intention may influence beneficiaries’ experiences of gratitude and indebtedness, and their subsequent tendencies to reciprocate.

This Stage 1 manuscript was evaluated over two rounds of in-depth review by three expert reviewers and the recommender. After the revisions, the recommender judged that the manuscript met the Stage 1 criteria and therefore awarded in-principle acceptance (IPA).
URL to the preregistered Stage 1 protocol:
Level of bias control achieved: Level 6. No part of the data or evidence that will be used to answer the research question yet exists and no part will be generated until after IPA. 
List of eligible PCI RR-friendly journals:

1. Tsang, J.-A. (2006). The effects of helper intention on gratitude and indebtedness. Motivation and Emotion, 30, 199–205.

2. Chan, C. F., Lim, H. C., Lau, F. Y., Ip, W., Lui, C. F. S., Tam, K. Y. Y., & Feldman, G. (2024). Revisiting the Effects of Helper Intention on Gratitude and Indebtedness: Replication and extensions Registered Report of Tsang (2006). In principle acceptance of Version 3 by Peer Community in Registered Reports.
Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.

Evaluation round #2

DOI or URL of the report:

Version of the report: 2

Author's Reply, 11 Jan 2024


Revised manuscript:

All revised materials uploaded to:, updated manuscript under sub-directory "PCIRR Stage 1\PCI-RR submission following R&R 2"

Decision by Zhang Chen, posted 05 Sep 2023, validated 07 Sep 2023

Dear Dr. Gilad Feldman and co-authors,

Thank you for submitting your revised Stage 1 Registered Report “Revisiting the Effects of Helper Intention on Gratitude and Indebtedness: Replication and extensions of Tsang (2006)” to PCI Registered Reports.

I have now received comments from three reviewers who also reviewed the original RR. We are all satisfied overall with the revisions that you have made, and think that the revised manuscript is much improved. Two reviewers have provided some further minor comments, and I too have noted some small issues while reading the manuscript again myself (see below). I wish you good luck with addressing these final comments, and I look forward to seeing the revised RR.


Kind regards,

Zhang Chen


Minor comments:

Page 9: “In line with the difference in action tendencies between gratitude and indebtedness, they also found that participants were more likely to express willingness to return the favor if the benefactor did not communicate strong reciprocation expectations.” You mentioned above that “gratitude leads people to thank their benefactor, whereas indebtedness leads people to try and return the favor”, and that “higher expectations resulted in decreased gratitude yet increased indebtedness”. Combining these two findings, I would expect that no strong reciprocation expectations -> increased gratitude and decreased indebtedness -> reduced willingness to return the favor, thus opposite to the claim in the first sentence?

Table 1: Hypotheses 4, 5 and 4+5 are said to be re-analyses of hypotheses 2, 3 and 2+3, but they do not correspond to each other (4 and 5 are the same).

Table 1: Hypothesis 7c: “Gratitude is higher in Ambiguous condition compared to Selfish condition”, but the effect size Cohen’s d is negative.

Table 1: Is there a reason why hypothesis 7c is about ambiguous versus selfish, whereas hypothesis 8c is about benevolent versus selfish? Later on, 7c and 8c are compared, but the two do not really seem comparable.

Page 19: Please also cite R and G*Power to give credits to their developers.

Page 29: "There were 16 effect sizes calculated from the target study (see Table 2)". Should be Table 1?

There is some inconsistency in how the 'selfish' condition is named throughout the manuscript (sometimes called ulterior). It's better to use a consistent label.

Table 6: It may be good to explicitly mention (again, perhaps under Procedural details) that the target article conducted Studies 2 and 3 separately, whereas here the two studies are combined into one data collection batch.

Page 32, Replication: Extension analyses: I can understand the analyses, but I find the description a bit difficult to follow. It may help the readers if you could explicitly specify which ANOVAs were conducted, something like "mixed ANOVA with intention condition (benevolent versus selfish in Study 2, and benevolent versus selfish versus ambiguous in Study 3) as a between-subject factor, and emotion type (gratitude versus indebtedness) as a within-subject factor, and the reported emotion strength as the dependent variable".

Page 33: Is there a rationale for why the alpha level is adjusted to .005 (in Order effects) and .001 (in Outliers and exclusions)?

Reviewed by Sarahanne Miranda Field, 21 Aug 2023

Dear authors and recommender, 

I have read both the authors' response to reviews and the revised manuscript. I am pleased to see that much work has been done to improve the proposed study protocol, and I am happy with how the authors responded to my comments. I am satisfied that a study based on this protocol would be of sound methodological quality, and would provide enough detail to be reproducible by other parties. I look forward to seeing the study when it is complete. 

Signed, Sarahanne M. Field

Reviewed by Jo-Ann Tsang, 12 Aug 2023

I appreciate the authors' responses to the reviews; they helped clarify some confusion and concerns I had with the research. Although I am hesitant about the within-subjects nature of the registered report study, as long as the authors collect enough data to examine order effects (which it appears they will), I have no problem with it. Below are some small suggestions to further improve the research. 

- on p. 17 of the marked document, item #9, it might be clearer if, rather than using the label "selfish intentions" the authors used "benevolent motivations" so that it reads "Ratings of benevolent motivations are associated with gratitude . . ."  This suggestion also applies to item 10, and item 9 + 10. 

- related to the authors' proposed extension looking at indebtedness and gratitude on reciprocity, Nelson et al 2023 also looked at gratitude, indebtedness, and prosociality (

- around p. 20, the authors report sensitivity and power analyses. The authors may want to explicitly state that the number of participants they plan on recruiting is more than enough to provide adequate power for moderation analyses looking at order effects on the presentation of Study 2 vs. Study 3. 

- on p. 26, on the item "Study 2 and 3 are conducted seperately", many of the points raised in the "Reasons for change" column are unrelated to reasons for changing the original study to within-subjects. For example, the reasons "To reduce the order effect" and "to avoid the influence of decline to particular studies" are not the reasons why the authors moved to a within-subject design (although they are reasons for counterbalancing). The only reason in that section that explains a change from the original study is the statement, "to find potential consistency within participants’ answers (whether an answer is predictive of another answer". The other reasons, while good reasons for counterbalancing, are not relevant to the change from the between Ss nature of the original studies, and the within-Ss nature of the proposed study.

- on p. 37 on the table for the classification of replication, would it be appropriate to include in procedural details the fact that participants will see both Study 2 and Study 3 stimuli in the new research?

- on p. 46 the title for Figure 1 needs to be fixed.

- on p. 47 the title for Figure 2 needs to be fixed.

- In the authors' response letter, the third comprehension check for Study 3 appears to have incorrect answers. "a" (I know without doubt it is because my friend wanted to borrow my car.) should be selfish, "b" (It is not clear about the two being related, but the weekend after helping me this friend asked to borrow my car) should be ambiguous, and "c" (My friend is really concerned about me) should be benevolent. 

I look forward to seeing the authors' completed research. 

Reviewed by Cong Peng, 04 Sep 2023

I thank the authors very much for addressing my concerns so clearly and in such detail, which is impressive. I think the current manuscript is much improved. I still have one small concern about the extension with Watkins et al (2006). I wonder whether perceived expectation for reciprocity is also a manipulation check rather than a DV. In Study 3 in the Selfish condition, the intention seems to stem from an expectation for reciprocity. And the vignette actually mentions the higher expectation for reciprocity explicitly: “in order to borrow your car next weekend”.
Besides, some minor points. I saw a list of α in the Measures. What does α mean here? If they are Cronbach’s α, they seem incorrect (all below 0.1). If they are not, please specify. And Ames et al. (2004) was cited in the text (p.10) but is not in the reference list.
Good luck with your research!
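
For readers unfamiliar with the statistic the reviewer refers to, the following is a minimal sketch of how Cronbach's α is conventionally computed from item scores: α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The item names and ratings are hypothetical and are not taken from the manuscript.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of per-item score lists.

    Each inner list holds one item's scores, with one entry per respondent.
    """
    k = len(items)
    # Total scale score for each respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_var = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

# Hypothetical ratings from five respondents on a three-item scale.
item1 = [5, 6, 4, 7, 3]
item2 = [5, 7, 4, 6, 3]
item3 = [6, 6, 5, 7, 2]

alpha = cronbach_alpha([item1, item2, item3])
print(f"Cronbach's alpha = {alpha:.2f}")
```

Items that covary strongly, as in this made-up example, yield a value close to 1, whereas values near zero indicate items that barely covary. That is why reported values below 0.1 would be surprising for a multi-item emotion scale.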

Evaluation round #1

DOI or URL of the report:

Version of the report: 1

Author's Reply, 26 Jul 2023


Revised manuscript:

All revised materials uploaded to:, updated manuscript under sub-directory "PCIRR Stage 1\PCI-RR submission following R&R"

Decision by Zhang Chen, posted 03 Apr 2023, validated 03 Apr 2023

Dear Gilad Feldman and co-authors,

Thank you for submitting your Stage 1 Registered Report “Revisiting the Effects of Helper Intention on Gratitude and Indebtedness: Replication and extensions of Tsang (2006)” for consideration by PCI Registered Reports.

I have now received comments from three expert reviewers, including the author of the original paper. As you will see, reviewer 1 (Tsang) and 2 (Field) are generally very positive about the current replication and extension, while reviewer 3 (Peng) is a bit more critical of the literature review and the methodology. All reviewers provided helpful and constructive comments that can be used to further improve the manuscript. Based on the reviews and my own reading, I would therefore like to invite you to submit a revised version.

Both reviewers 1 and 3 have concerns about combining study 2 and 3 within subjects. I think these are valid concerns - combining both studies will make the current investigation less of a 'close' replication, and potentially complicate the interpretation of results. Of course, by looking only at the first study completed by each participant, you will still be able to examine each study separately, but with only half of the original sample size. Since data will be collected online via Prolific, I think running study 2 and 3 as two separate studies will not increase the total monetary cost and the time needed for data collection. As such, I agree with reviewers 1 and 3 that running study 2 and 3 as two separate studies seems to be a better option.

The manuscript is overall well-written. However, I agree with reviewer 2 that the short paragraphs at some places impede the overall readability. Furthermore, reviewer 3 provided many useful references that can make the literature review more comprehensive, and further strengthen the motivation for the current replication. Please also double-check the revised manuscript to make sure all in-text citations are included in the references, and vice versa.

American students will be recruited from Prolific for this study (Table 2). Please provide more details on how the participants will be selected from the overall population on Prolific (e.g., what pre-screening options on Prolific will be used to recruit student participants from the US). This will help address the comment by reviewer 1, namely the scenario in study 3 is mostly relevant to students, but less so for non-students.

Reviewer 1 also raised another interesting point, namely 200 dollars in 2006 would be worth almost 300 dollars today. When it comes to replications, there may be a tension between using the exact same stimuli, versus using stimuli that have similar 'psychological' meanings. I do not have a clear idea on this. For me, whether the amount matters or not in this case is eventually an empirical question, that can be tested by e.g. giving half of the participants the original 200-dollar version, and the other half the updated 300-dollar version. The amount can be included in the analyses as an extra factor, to (1) examine its potential influence, and (2) test whether the findings hold in both conditions. However, this comes with the cost of making the study less of a 'close' replication. I am curious to hear your thoughts on this.
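
As a rough illustration of the arithmetic behind the "almost 300 dollars" figure, a past amount can be rescaled by the ratio of two consumer price index values. This is a sketch only: the index values below are approximate US CPI-U annual averages assumed for illustration and should be checked against official figures before use, and `adjust_for_inflation` is a hypothetical helper name.

```python
def adjust_for_inflation(amount, index_then, index_now):
    """Scale a past dollar amount by the ratio of two price-index values."""
    return amount * index_now / index_then

# Approximate CPI-U annual averages (assumed values, for illustration only).
CPI_2006 = 201.6
CPI_2023 = 304.7

adjusted = adjust_for_inflation(200, CPI_2006, CPI_2023)
print(f"$200 in 2006 is roughly ${adjusted:.0f} in 2023 dollars")
```

Under these assumed index values the result lands close to 300 dollars, consistent with the reviewer's estimate.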

Below are some notes/thoughts that I had while reading the manuscript myself:

Hypotheses 1 (1a-1c) and 6 in Table 1 are about the correlations between gratitude and indebtedness in different helper intention conditions. It is not entirely clear to me how these correlations may address the core hypothesis in Tsang (2006) and in the current replication, namely “benevolent (versus selfish) intentions were more strongly associated with gratitude than with indebtedness” (Page 12). I can see how the results of ANCOVA and regressions can answer this question, but I am not sure which pattern of correlations between gratitude and indebtedness would support or refute the core hypothesis. I may have missed it, but the introduction also does not discuss the correlation between gratitude and indebtedness. If the correlation between gratitude and indebtedness is an important piece of evidence for the core hypothesis, you may need to discuss this more explicitly and extensively in the introduction, especially on how it is related to the core hypothesis. However, if the correlation is not central, it may be better to move them into the supplementary materials, or make a distinction between primary vs. peripheral tests.

For the regression analyses, it is not entirely clear to me what the predictor 'helper intention' refers to, either (1) the different conditions that participants are assigned to, or (2) the perceived helpers’ motivations (i.e., DV 4). If the former, I think the ANCOVAs and regressions are essentially the same, but presented in slightly different ways. Both use helper intention conditions and the magnitude of a favor as predictors, and gratitude or indebtedness as the outcome. Squaring the t value from the regression should give the F value from the ANCOVA, and the p values should be the same for each effect. If the authors can verify that these two tests are equivalent, combining them in the results section will simplify things. Table 1 can also be simplified (i.e., no need to repeat the same predictions twice).
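
The equivalence mentioned above can be checked numerically. The sketch below uses simulated data (not data from the study) with a two-level condition and one covariate: it fits the regression by ordinary least squares, computes the t statistic for the condition coefficient, and compares its square with the F statistic obtained by comparing the models with and without the condition term. All variable names and effect sizes are assumptions made for illustration.

```python
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """Return coefficients, residual sum of squares, and (X'X)^-1 for y ~ X."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    rss = sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
              for row, yi in zip(X, y))
    # Invert X'X column by column by solving against unit vectors.
    cols = [solve(XtX, [1.0 if i == j else 0.0 for i in range(p)]) for j in range(p)]
    inv = [[cols[j][i] for j in range(p)] for i in range(p)]
    return beta, rss, inv

# Simulated data: condition (0 = selfish, 1 = benevolent), favor magnitude, gratitude.
random.seed(1)
n = 200
condition = [i % 2 for i in range(n)]
favor = [random.gauss(4, 1) for _ in range(n)]
gratitude = [3 + 0.8 * c + 0.5 * f + random.gauss(0, 1)
             for c, f in zip(condition, favor)]

X_full = [[1.0, c, f] for c, f in zip(condition, favor)]
X_reduced = [[1.0, f] for f in favor]  # same model without the condition term

beta, rss_full, inv = ols(X_full, gratitude)
_, rss_reduced, _ = ols(X_reduced, gratitude)

sigma2 = rss_full / (n - 3)                       # residual variance, full model
t_condition = beta[1] / (sigma2 * inv[1][1]) ** 0.5
f_condition = (rss_reduced - rss_full) / sigma2   # 1 numerator df

print(f"t^2 = {t_condition ** 2:.4f}, F = {f_condition:.4f}")
```

The identity holds for effects with one numerator degree of freedom. With three intention conditions, as in Study 3, the omnibus F has two numerator degrees of freedom, so the correspondence applies to individual contrasts rather than to the overall condition effect.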

I feel the proposed statistical tests do not provide a direct and formal test of the core hypothesis. For instance, in study 2, the predictions 2+3 (and other predictions involving comparisons between gratitude and indebtedness) seem to rely on a descriptive comparison of effect sizes between both conditions, but not formally tested. One may test this directly, e.g. by using (1) the helper intention condition, (2) the magnitude of a favor, and (3) the type of emotion examined (gratitude vs. indebtedness; within-subjects) as predictors, and the reported strength of an emotion as the dependent variable. Is it correct to say that the core hypothesis would then translate into a statistically significant interaction between factors (1) and (3)? If yes, I think it would be informative to conduct such an analysis, as another 'extension' of the original findings.
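
To make the suggested interaction test concrete: in the simplest case of two intention conditions and two emotions, and leaving the favor-magnitude covariate aside, the condition by emotion interaction in a mixed ANOVA is equivalent to a pooled-variance t-test comparing per-participant difference scores (gratitude minus indebtedness) between conditions, with F equal to t squared. The sketch below illustrates this with made-up ratings; it is not the authors' analysis plan.

```python
from statistics import mean, variance

def pooled_t(a, b):
    """Student's t (pooled variance) for two independent samples; returns (t, df)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5
    return t, na + nb - 2

# Hypothetical (gratitude, indebtedness) ratings, one pair per participant.
benevolent = [(6, 4), (7, 5), (6, 5), (5, 4), (7, 4)]
selfish = [(4, 4), (3, 4), (5, 5), (4, 5), (3, 3)]

# Within-participant difference scores: gratitude minus indebtedness.
diff_benevolent = [g - i for g, i in benevolent]
diff_selfish = [g - i for g, i in selfish]

t, df = pooled_t(diff_benevolent, diff_selfish)
print(f"t({df}) = {t:.3f}; interaction F(1, {df}) = {t ** 2:.3f}")
```

A positive t here would mean that the gap between gratitude and indebtedness is larger under benevolent than under selfish intentions, which is the pattern the core hypothesis predicts.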

I am not sure if I fully get the predictions when combining the findings from Tsang (2006) and Watkins and colleagues (2006) on Page 15. Watkins et al. found that high expectations for reciprocity would increase indebtedness but decrease gratitude. Assuming that "benevolent giving may be associated with lower expectations for reciprocity than selfish giving", my chain of reasoning is that benevolent giving -> lower expectations for reciprocity -> decreasing indebtedness and increasing gratitude. It is unclear to me why "according to the findings by Watkins et al. (2006), benevolent giving may result in more indebtedness than gratitude, the opposite of the predictions by Tsang (2006)."

Please provide more details on potential data exclusion criteria. E.g., do participants need to pass all comprehension checks in order to be retained in the analysis? I wonder if there are other data quality checks. Especially for study 2 where participants have to recall their past experience and type it into open-ended questions - I can imagine some online participants may not be very motivated to do this. Are there any other measures that may be used to filter out low-effort responses, such as extremely fast responses or short answers?

Some minor points:

Page 4: In the abstract, the effect size of helper intention on indebtedness in Study 3 is outside of the 95% CI ("η2p = .14, 95% CI = [0.00, 0.03]").

Page 7: "We then discuss our motivations for the current replication review and review Tsang (2006)...". The first 'review' should be removed?

Page 11: "Especially so given that the target article sometimes theorized using null effect language and concluded no differences from null effects.". This sentence is not entirely clear to me.

Page 16: "Therefore, our extension ties and contrasts the predictions by Tsang (2006) and Watkins et al. to examine how helper intentions are tied." This seems to be an incomplete sentence?

Page 18: "Effect size and confidence intervals were all calculated with Rstudio (Version: 1.4.2)". I think it's important to also report the version of R used - after all, R is doing all the computing, and RStudio is mostly an IDE for R.

Page 19: The planned sample size is inconsistent, being 800 at some places but 1000 at other places.

Page 25: "including questions about what flavor was offered in the scenario". "flavor" should be "favor".

Page 31: "we used correlation tests (Pearson's Correlation) to examine the association between helper intention conditions (benevolent and selfish) and emotions (gratitude and indebtedness)". If I understood this correctly, the correlation tests are to examine the association between gratitude and indebtedness across different helper intention conditions (but see my comment above)?

Kind regards,

Zhang Chen

Reviewed by Jo-Ann Tsang, 09 Feb 2023

I am excited to see this replication and extension of work on intention, gratitude, and indebtedness. I am glad that the authors are recruiting a bigger sample size, adding more manipulation checks, and measuring the additional dependent variable of reciprocity intentions. Below are some suggestions and questions which I hope will be helpful to the researchers in conducting their study. 

- I may have missed it, but I don't think the authors specified why they were replicating Studies 2 & 3, but not Study 1. When reading Tsang (2006), it is obvious that one would try and replicate Study 3 rather than Study 1, but readers might not be familiar with the methods of the original studies and therefore might miss this. 

- The hypotheses in Table 1 (p. 13) that were reworded from the null (1c, 3, 5, 8a, 8b, 8c, 10) were confusing to me. I understand the need to reframe the original null predictions, but the reframed hypotheses were making predictions that the original paper did not make. The combined hypotheses made more sense to me as a reframing. I'm not sure if there is a way to clarify this.

- at the bottom of page 15, the authors theorize about the relationship between intention and reciprocity, but then on p. 16 make predictions about gratitude and indebtedness. This was a little unclear to me; I was expecting the predictions to be about intention and reciprocity given the theorizing. Perhaps there is more the authors can say about theory relating reciprocity to gratitude/indebtedness before they get to those predictions, which would make the argument a little clearer. 

- Method: I see the rationale for running Study 2 & 3 within-subjects, but I am worried that this may subtly influence the results. For example, if a participant is assigned to a benevolent intention condition in Study 2, but then is assigned to a selfish condition for Study 3, this may introduce a contrast effect--selfish favors may seem more selfish after writing about a benevolent favor. Participants who are asked to read a textbook scenario first, might then be influenced by this scenario when they recall their own received favors in the subsequent study. Additionally, gratitude or indebtedness may be primed in ways not primed by the original studies if the studies are run together. Thus, running these studies together might compromise the closeness of the replication. 

- oddly, given my previous comment, I am also worried about the directness of the replication of the scenario study, in that the population from which the authors are recruiting are not all students, and also given the passage of time from 2006 until now. Specifically, the scenario from Tsang (2006) was designed to be relevant to the student population from which the participants were recruited. However, the current authors plan to recruit from a broader population, and it is likely that students will be in the minority in their participant pool. Thus, the scenario may be less relevant to their participants, and induce less gratitude and indebtedness. The amounts used in the original scenario also mean something different today. For instance, $200--the amount lent to the protagonist in the original study--would be worth almost $300 today. Thus, using the same exact scenario today as Tsang used in 2006 would lead to participants reading about a less valuable favor. 

- it makes sense to measure reciprocation intentions as an extension. The authors might look at Peng et al. (2020) to inform predictions regarding gratitude, indebtedness, and reciprocation. 

Reviewed by Cong Peng, 30 Mar 2023

The current paper intends to offer a replication and extension of Tsang (2006) investigating the effect of helper intention on gratitude and indebtedness. Tsang (2006) suggested that perceived benevolent intention would trigger higher gratitude but would not affect indebtedness. Tsang (2006) is indeed a pioneering work and inspired many later researchers to differentiate gratitude and indebtedness. However, there are severe problems with the current manuscript that prevent me from recommending that it proceed to Stage 2.
First, I think the current literature review on gratitude and indebtedness is very limited and far from comprehensive. There is an accumulated literature on the relation-oriented function of gratitude in promoting intimate bonds (Algoe, 2012; Algoe et al., 2013; Bartlett et al., 2012; Gordon et al., 2012; Kubacka et al., 2011; Lambert et al., 2010; Ng et al., 2017; Peng et al., 2018; Williams & Bartlett, 2015), which helps clarify why benevolent intention is important for triggering gratitude. There is also an accumulated literature on the exchange-oriented function of indebtedness (Adams & Miller, 2022; Goyal et al., 2022; Naito & Sakata, 2010; Peng et al., 2018), which helps clarify why benevolent intention is NOT associated with indebtedness. Moreover, the current manuscript gives me the impression that the authors list literature loosely rather than providing a systematic review and clear arguments.
Second, the manuscript lacks basic scientific rigor. Many sources are cited in the introduction but not listed in the references, and I could not find them on Google Scholar either to judge their validity (e.g., Gray et al., 2001 on p7; Ortony et al., 1988 and Mathews & Green, 2010 on p8; Maureen & Jeffrey, 2009 on p9; Ames et al., 2004 and Welsh et al., 2021 on p10). Meanwhile, in many cases the authors fail to provide references for certain claims (especially when the claims are big), making them difficult for me to make sense of. Some examples: the opening sentence of the background on p7, “Gratitude and indebtedness are common emotions in response to receiving help. But studies suggested that they are experienced differently depending on situation”; and p8 line 5, “These two emotions have often been equated in the early literature, yet evidence showing that these emotions are elicited in different situations suggested the need to differentiate them.”
Third, the current replication makes things too complicated to qualify as a replication. I wonder why the authors chose to combine two separate studies in Tsang (2006) into one study rather than replicating them separately. This is no longer a replication, as the design of either study may affect the other. Whatever the results turn out to be, they would be very hard to interpret and compare with the original study. On top of this, the authors are extending it to mix in the design of Watkins et al. (2006), taking it even further from a replication. I think a good replication design should be as close and comparable to the original study as possible.

Reviewed by Sarahanne Miranda Field, 08 Mar 2023

Dear authors, 

Thank you for giving me the opportunity to review your work. My apologies for the lateness of my review. I hope my comments are nevertheless helpful. 

I am generally very positive about this Stage 1.

- The replication protocol is very clearly set out, and it appears to me as though, providing the replication study is conducted closely to how it has been described here, the replication has a very good chance of reinforcing the effects in question, should they 'exist'. Importantly, I think this replication study protocol leaves little room for flexibility or bias, which is important for meaningful and high-quality replication studies. 

- The theoretical background is clear and follows logically, and motivates the study sufficiently. The target sample size is motivated also, and seems reasonable.

- The planned statistical approach seems appropriate to me (although I will freely admit that I am no expert on these kinds of analyses). 

- Although I typically suggest using Bayesian methods to compare replication targets (as they allow one to quantify pro-null evidence), I am interested to see how the authors use the LeBel method for this study. 

- Manipulation checks and controls seem sufficient for purpose, from what I can tell. All in all, I look forward to what the results show and whether Tsang's original findings are supported. 

I have only three tiny quibbles (in no particular order):

1. This article, while generally clearly written and free of obvious typing errors, is plagued by very tiny paragraphs. I'm not a nit-picker usually, but these paragraphs are so small as to be distracting and sometimes make the reading more difficult than it should be. Particular examples of where paragraphs could be merged are on pg 11 ("We chose..." might be merged with "The article has...") and pg 12 ("Tsang (2006) examined..." might be merged with "We focused our...") and so on. This isn't a huge deal-breaker, but readability would be improved, in my opinion, if the structure of the article were revised with this in mind. 

2. I find some of the figures a little hard to read. Violin plots are great and the figures generally look very good, however the raw data points on some of them are quite large and very transparent, which makes the distributions hard to see underneath the boxes. Figures 3, 4 and 8 for instance are great - the data points in those are smaller and you can clearly see the way the data are distributed, however Figures 5 and 6 (etc) are harder to make out. Not a huge issue, but given that the data in some of the figures seems quite evenly distributed, it's harder to see the distributions' nuances. 

3. The figures are numbered weirdly, unless I'm missing something. It appears as though there are two Figure 8s? 

As I mentioned above, this replication study plan looks very well thought-out to me, and will be a reasonable 'test' of Tsang's Study 2 and 3 (to the extent that a single replication can be, of course!). I wish the authors luck with the data collection, should they get the go-ahead!

I sign every review I undertake,

Sarahanne M. Field
