Complexity of Shakespeare’s Social Networks
Using Shakespeare to Answer Psychological Questions: Complexity and Mental Representability of Character Networks
Recommendation: posted 07 February 2024, validated 10 February 2024
Karhulahti, V. (2024) Complexity of Shakespeare’s Social Networks. Peer Community in Registered Reports. https://rr.peercommunityin.org/articles/rec?id=489
Level of bias control achieved: Level 3. At least some data/evidence that will be used to answer the research question has been previously accessed by the authors (e.g. downloaded or otherwise received), but the authors certify that they have not yet observed ANY part of the data/evidence.
List of eligible PCI RR-friendly journals:
2. Thurn, C., Sebben, S. & Kovacevic, Z. (2024) Using Shakespeare to Answer Psychological Questions: Complexity and Mental Representability of Character Networks. In principle acceptance of Version 3 by Peer Community in Registered Reports. https://osf.io/6uw27
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.
Evaluation round #2
DOI or URL of the report: https://osf.io/s97y3
Version of the report: 20231130_PCI-RR_Stage1_Revision1.pdf
Author's Reply, 06 Feb 2024
Decision by Veli-Matti Karhulahti, posted 06 Jan 2024, validated 07 Jan 2024
Dear Christian Thurn and colleagues,
Thank you for all careful revisions and detailed responses to previous feedback. Two reviewers were able to return to carry out another feedback round and they both were very satisfied with this improved version. They only had a few minor suggestions. I’ll let you consider that feedback in a final revision, but I won’t invite the reviewers anymore for a third round. I agree this version is good and near-ready for IPA.
I have only one comment of my own. This concerns H1.
- Because you're testing a confirmatory hypothesis, it would be good to explicitly justify why you expect a certain outcome in the Study 3 section before H1. Currently you write, "we are interested in how the number of characters in a play relates to the complexity... our goal is to understand the relation of the number of characters to the complexity of networks in theatre plays," which is an exploratory description. But your H1 is a confirmatory test ("We test the hypothesis that the number of characters positively predicts complexity (H1)") so it would be important to briefly recap in the Study 3 section why you predict a positive result.
- It feels to me that your assumptions ("we assume that plays are more likely to be well-received and popular if they make it possible for recipients to follow the narrative") predict the null instead of a positive correlation. I might be mistaken, but it would be worth clarifying what you expect and why.
- Because this is a hypothesis test, and PCI RR is very strict about justifying effect sizes in hypothesis tests, it would be good to have some kind of justification for r < .30s as small effects. I know it’s difficult to think about justification in this context. I also find it challenging to help with this—especially as I lack a comprehensive understanding of Kolmogorov complexity in the present context—but here are some ideas.
· One option would be to seek existing character network data in fiction and see what r=.3 looks like. E.g., in fiction (of any media), what is r=.3 in terms of complexity? Can you effectively separate actual works of fiction by complexity? This could help both you and readers grasp the raw effect size and justify it.
· Another option that comes to mind would be to simulate data with r=.3 and see how the effect size appears in these simulated instances. Being able to pinpoint reasonable raw differences even in simulated form would be better than nothing.
· A third option could be to select one actual play by Shakespeare, provide a description of its character network, and demonstrate a hypothetical raw change of .3 in practice.
· A fourth (meta)option would be to take a Bayesian approach and rely on (non-informative) priors. Some justification would still be nice to have but the basis of the rationale would be less problematic for inference.
· (The same concerns RQ4 but since it’s exploratory it doesn’t matter. Btw, also noticed you don’t mention alpha anywhere in the paper -- is it 5% throughout?)
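To make the simulation idea above more concrete, here is a minimal sketch of what it could look like. It is not the authors' analysis: the variables are standardized stand-ins for "number of characters" and "network complexity", the function names (`simulate_correlated`, `pearson_r`, `median_split_gap`) are hypothetical, and bivariate normality is an assumption made only for illustration.

```python
import math
import random


def simulate_correlated(n, rho, seed=0):
    """Draw n pairs (x, y) of standard normal variables with population correlation rho.

    x is a hypothetical standardized 'number of characters' and
    y a hypothetical standardized 'complexity' score (both are assumptions).
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        # y = rho * x + sqrt(1 - rho^2) * noise has unit variance and corr(x, y) = rho
        xs.append(x)
        ys.append(rho * x + math.sqrt(1.0 - rho ** 2) * noise)
    return xs, ys


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def median_split_gap(xs, ys):
    """Mean y in the upper half of x minus mean y in the lower half of x (in SD units of y)."""
    pairs = sorted(zip(xs, ys))
    half = len(pairs) // 2
    low = [y for _, y in pairs[:half]]
    high = [y for _, y in pairs[half:]]
    return sum(high) / len(high) - sum(low) / len(low)


if __name__ == "__main__":
    xs, ys = simulate_correlated(n=20000, rho=0.3, seed=1)
    print("sample r:", round(pearson_r(xs, ys), 3))
    print("median-split gap in y (SD units):", round(median_split_gap(xs, ys), 3))
```

With r = .3, about 9% of the variance in one variable is shared with the other (r² = .09). Printing the median-split gap is one way to turn the correlation into a raw difference that readers can judge; translating it back into actual complexity units would require the standard deviation of the real measure.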
Some of the above ideas may be unfeasible, so please read them primarily as food for thought; I hope they guide you to the best solution from your own topic-expert position. As a standard note, I refer you to PCI RR evidence thresholds and these two papers by Zoltan Dienes on specifying theoretically relevant effect sizes for statistical hypothesis testing:
I sadly don’t have any good examples from theatre or literature, but if you wish to have practical examples of effect size justification from other fields, contact me and I will seek some from the PCI RR archive. If you are unsure about anything else or wish to discuss, you can email (as usual) before submitting.
Reviewed by James Stiller, 05 Jan 2024
Reviewed by Matus Adamkovic, 05 Jan 2024
Evaluation round #1
DOI or URL of the report: https://osf.io/nqf7e
Version of the report: PCIRR-Snapshot_ReplicationBattery.pdf
Author's Reply, 30 Nov 2023
Decision by Veli-Matti Karhulahti, posted 30 Aug 2023, validated 31 Aug 2023
Dear Christian Thurn and colleagues,
Thank you for submitting to PCI RR and your patience with a small delay. I am delighted to have received four reviews from diverse experts, including those of social networks, statistics, literature, and Shakespeare. The feedback is generally positive and I am personally excited to serve as a recommender for this genuinely interdisciplinary work. There are comments that need careful attention, however. I summarize some key points below and add a few of my own.
1. A primary worry coming from all four reviewers is that (to synthesize it in my own words) the “scientific goal” of the study is unclear. To be precise, this does not refer to how the research plan is presented — this is exceptionally clear — but rather what is the scientific question that the study wants to figure out. Do you wish to contribute to the theory behind Dunbar’s number? Do you wish to learn more about Shakespeare, drama, or character networks in fictional narratives? As the reviewers point out in different ways, the extended replication will surely yield new useful information, but it is not clear what that means. If the original study replicates or not, what can we deduce from that, theory- or otherwise? Especially because this is a carefully designed RR which allows robust tests of hypotheses and theories, it feels like a lot of potential value can be “wasted” without committing to theoretically risky interpretations. See the next comments for follow-up.
2. Although the MS explicitly says that it is not designed to test hypotheses (Bias control), there are several criteria set for different outcome interpretations and in some cases they even lead to falsifying certain theoretical positions (as the four RQs show in the end). On the other hand, this seems like very traditional hypothesis/theory testing, sometimes with clear H1/H0/undecided interpretations. It is a bit unclear how this is different and why it has been separated from hypothesis testing and/or confirmatory work. I will list more detailed examples next.
3. RQ1: “The theory is that Shakespeare’s plays and the ethnographic observation of human group size come from the same distribution.” Indeed, it is clear here that we are curious about similarity, statistically. Now, taking a few steps back, why is this similarity interesting? One could say, e.g., if similar, Shakespeare’s fiction accurately simulates real human social life (Dunbar's number serving as an auxiliary hypothesis for social life), but this would be unlikely to be true due to reasons pointed out by the reviews, which show how such a simulation appears to be very inaccurate if we look at details. One could alternatively say, as you hint on page 5, that “drama is especially effective if it mirrors reality”, i.e., if similar, one of the reasons for Shakespeare’s success is that people are able to cognitively reflect on social networks, which are (on average) similar in size to theirs. Again, this seems unlikely for various reasons (which we don’t need to discuss here). In sum, there are interesting data and analyses, but we are not fully sure what the results will tell us (beyond statistical outcomes). The same applies to RQ2: “The theory that the average conversational clique size which is between three and four people can be found in Shakespeare plays”, and RQ4 “The theory that Shakespeare plays as dramas are in an Aristotelian view reflecting reality and show similar small world-properties in their networks.” I want to be very clear that it is fully ok to register exploratory analyses, and there is no need for confirmatory tests in RRs, but currently the MS is sitting between the two sides without having fully outlined the rationale (how do these exploratory analyses contribute to the literature, or what does it mean if a certain position/theory is falsified).
4. The fourth (anonymous) reviewer is an expert in Shakespeare as well as literature in general. Because the review was not submitted via the system, I am attaching it manually at the end of this recommendation. This (the most critical) reviewer is explicitly concerned that Dunbar’s number is not suitable for drama in general due to huge genre variation. If you agree and believe that this may be true, it is one of the possible hypotheses to test and, if corroborated, it could make a major contribution to the literature on fictional social networks and their analyses.
5. If you follow the reviewers’ suggestions to set smallest effect sizes of interest, please carefully justify the SESOI by some raw effect if possible; this is a recurring matter discussed in depth at PCI RR.
The reviewers also provide plenty of detailed comments on the design and methodology. Please consider them all carefully. I hope you find them useful and valuable in your revisions. Last, I want to stress that the value of this study, to me, is generally sufficient to be carried out even without the theoretical, pragmatic, or other contributions which most of my comments above address. I can see it can be a useful methodological exercise and resource for future scholars to learn from. However, I do hope you consider the above notes because with a medium effort, much more value could be generated.
If something is unclear or you wish to ask anything during the revision process, you can contact me directly for clarifications or checks.
Reviewer 4 (anonymous)
Let me start by saying that I was asked to report on this research as a literary expert. I will thus not discuss the statistical side of the authors’ work, but only their potential interest and validity for literary analysis.
1A. The scientific validity of the research question(s)
The study they intend to replicate had virtually no value for literary study. The selection of the 10 plays made little literary sense, because a) it focused only on those plays that most conveniently agreed with the “Dunbar number”, and b) ignored what is crucial from the viewpoint of drama (and of dramatic networks), namely the difference in genre. On a) this replication may indeed prove useful, in testing, and almost certainly falsifying, the original study. On b) no, because the study shows a total disregard for dramatic genre. (Genre is meaningful because comedies always have a much higher density than tragedies, which in turn have a much higher density than histories; ignoring this initial fact creates only confusion.)
In addition, in the (infrequent) moments in which the study mentions literature, its categories and references can only be described as primitive; even when they refer to quantitative and/or network analysis of drama, they mention very peripheral studies and ignore crucial ones, such as Yarkho’s on speech groups.
1B. The logic, rationale, and plausibility of the proposed hypotheses (where a submission proposes hypotheses)
I do not believe a hypothesis is being proposed.
1C. The soundness and feasibility of the methodology and analysis pipeline (including statistical power analysis or alternative sampling plans where applicable)
I am not qualified to evaluate that.
1D. Whether the clarity and degree of methodological detail is sufficient to closely replicate the proposed study procedures and analysis pipeline and to prevent undisclosed flexibility in the procedures and analyses
It might; I am not qualified to judge. But the question assumes that the original study deserves to be replicated – an assumption I personally consider groundless.
1E. Whether the authors have considered sufficient outcome-neutral conditions (e.g. absence of floor or ceiling effects; positive controls; other quality checks) for ensuring that the obtained results are able to test the stated hypotheses or answer the stated research question(s).
I am not qualified to evaluate that.