Comics in Education

Veli-Matti Karhulahti, based on reviews by Adrien Fillon, Benjamin Brummernhenrich, Solip Park and Pavol Kačmár
A recommendation of:

Learning from comics versus non-comics material in education: Systematic review and meta-analysis


Submission: posted 16 October 2023
Recommendation: posted 13 June 2024, validated 21 June 2024
Cite this recommendation as:
Karhulahti, V. (2024) Comics in Education. Peer Community in Registered Reports, .


Especially after the impactful experiments in modern comics (e.g. McCloud 1993), research interest in the medium increased with new practical developments (Kukkonen 2013). Some of these developments now manifest in educational settings where comics are used for various pedagogical purposes in diverse cultural contexts. To what degree comics are able to reach educational outcomes in comparison to other pedagogical tools remains largely unknown, however.
In the present registered report, Pagkratidou and colleagues (2024) respond to the research gap by investigating the effectiveness of educational comics materials. By means of systematic review and meta-analysis, the authors assess all empirical studies on educational comics to map out what their claimed benefits are, how the reported effectiveness differs between STEM and non-STEM groups, and what moderating effects complicate the phenomenon. With the help of large language models, all publication languages are included in the analysis.
The research plan was reviewed over three rounds by four reviewers with diverse sets of expertise ranging from education and meta-analytic methodology to comics culture and design. After comprehensive revisions by the authors, the recommender considered the plan to meet high Stage 1 criteria and provided in-principle acceptance.
URL to the preregistered Stage 1 protocol:
Level of bias control achieved: Level 3. At least some data/evidence that will be used to answer the research question has been previously accessed by the authors (e.g. downloaded or otherwise received), but the authors certify that they have not yet observed ANY part of the data/evidence.
List of eligible PCI RR-friendly journals:
1. Kukkonen, K. (2013). Studying comics and graphic novels. John Wiley & Sons.
2. McCloud, S. (1993). Understanding comics: The invisible art. Tundra.
3. Pagkratidou, M., Cohn, N., Phylactou, P., Papadatou-Pastou, M., & Duffy, G. (2024). Learning from comics versus non-comics material in education: Systematic review and meta-analysis. In principle acceptance of Version 4 by Peer Community in Registered Reports.
Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.

Evaluation round #2

DOI or URL of the report:

Version of the report: 1

Author's Reply, 10 Jun 2024


Dear Dr. Veli-Matti Karhulahti and reviewers,

Thank you for the feedback regarding our submission of the manuscript “Learning from comics versus non-comics material in education: Systematic review and meta-analysis” to the PCI RR. We believe that your thoughtful comments and constructive suggestions have strongly benefited the quality of the manuscript. We are excited to inform you that we have diligently incorporated your feedback, making the necessary changes to enhance the overall quality of our manuscript; and that we agree with your assessment and prefer to adhere to the exploratory plan for the reasons discussed in the previous review round. Your recommendations have played a pivotal role in refining this updated version, and we truly appreciate your commitment towards this end. We have listed a point-by-point overview of the changes below.

Thank you for your invaluable input and the positive impact it has had on our manuscript.

Yours sincerely,
The authors.

Decision by Veli-Matti Karhulahti, posted 11 May 2024, validated 11 May 2024

Dear Marianna Pagkratidou and co-authors,
Thank you for all the careful revisions. We were again lucky, as all four reviewers generously returned to re-assess the plan. In general, they are all satisfied with the current version. There are a few minor final edits requested, and I will let you address them accordingly.
One reviewer has left a longer comment regarding the loss of epistemic value due to the transformation of the plan from a confirmatory to an exploratory one. Although I agree that, in the reviewer’s own words, your team would be in “a very good position to formulate these theoretical arguments,” for me the lack of confirmatory inference (which would require further assessment of effect size meaning etc.) is not a significant loss here. Oftentimes the labor required to convincingly test a hypothesis in a confirmatory setting is larger than the potential gains. A well-designed exploration can be almost equally informative, especially as an RR that is transparent for readers (who can then respectively assess to what degree these rigorously obtained effects are meaningful). This also solves the language issue, which some reviewers were still worried about (misinterpretations might be fatal in confirmatory but not exploratory tests).
That said, if you find the reviewer’s argumentation convincing and still wish to add a formal hypothesis, please contact me before submitting the next version so that we can together ensure that the carefully crafted current plan doesn’t break.
Two short notes.
1. Although you’ve decided to refrain from h-testing, page 18 (and abstract) still mention h-testing. If the study is exploratory, the reference to h-testing should be removed to avoid confusion: the study will simply report the obtained effect(s) and discuss what these effects may mean (without making confirmatory claims about effectiveness). In such exploratory design, there is also no need to justify effect sizes anymore -- it’s ok to plan e.g., equivalence tests, just keep in mind that you cannot claim to confirm the null at Stage 2 (confirmation would require formal tests against justified effect sizes). 
2. As reviewers point out, it would be important to add further information about inter-rater reliability. We recently created reporting guidelines for transparent coding, which you may find helpful for adding the missing information:
After minor revisions, I believe the plan can be ready for IPA and does not require further external review rounds. However, I might briefly consult selected reviewers, depending on the changes in the next version. Again, you are free to contact me before submitting, in case you’re unsure how to solve some of the requested revisions.
Good luck and best wishes!
Veli-Matti Karhulahti

Reviewed by , 22 Apr 2024

Dear Authors and Editor,

I am sorry for the delay; I thought I had submitted a review, but it seems to have been lost. I don't have much to say about the current draft, as I am satisfied with the authors' changes regarding my contribution to the meta-analytic properties of the present study. I saw that the comments were more extensive regarding the field of investigation, in which I don't have any expertise.

Again, I think that once the authors, editor, and reviewers agree on the specific area to investigate, the protocol described in the method section can correctly test a meta-analytic effect of comics on learning.

Good luck with the analysis.


Reviewed by , 29 Apr 2024

I am glad that the authors have taken the time to revise their manuscript and plans. In my opinion, many parts of the plan are stronger in this version as a result of dealing with many issues that the reviewers had pointed out. Especially the details of the empirical part, such as open access to data and materials, stronger specification of the meta-analytic process, including effect size thresholds etc., are much clearer now.

I also think that the specification of texts instead of general "non-comics" is a very sensible step. However, I feel that the way that the authors have dealt with one of the other main points of my (and some of the other reviewers') assessment - regarding the theoretical foundation of the hypotheses - is not ideal. The authors have opted to do away with the hypotheses, labeling all analyses as exploratory. This may have been a response to the recommender's encouragement "to either formulate a theoretical, empirical, or other basis for testing the chosen hypotheses or transforming the plan toward a more exploratory direction". But I feel that this makes the whole plan weaker instead of stronger. The point of a meta-analysis - in my understanding - is to ascertain whether an effect exists and maybe estimate its size. It is a confirmatory exercise that you would only undertake if there is a sensible reason to expect an effect. The authors now argue that "it is not our claim that comics are beneficial for comics [sic], but rather to interrogate the claims by a range of other scholars that comics are beneficial for education". I find that argumentation questionable. The authors acknowledge that the claim exists and that it is this claim that is being assessed - how is this then different from a hypothesis that is being tested? I find it especially surprising considering that the authors have done research in this area themselves and that the keywords include some of the constructs that arguably play a role in mediating these effects (e.g. visual language fluency and spatial cognition). As far as I can tell, the authors seem to be in a very good position to formulate these theoretical arguments, given their expertise and experience in the domain.

The authors demonstrate that they have this expertise by presenting some theoretical arguments for the plausibility of the claimed effects. In my view this is very valuable and, instead of dropping the hypotheses, this should be extended into - as was the original aim per the title of the manuscript! - a systematic review of the theoretical and empirical arguments for the existence of the effects that the meta-analysis will focus on: Why can we expect comics to be superior to texts in certain situations? Why should this apply specifically to STEM fields? This should be much easier to accomplish now that the focus is on texts as the comparison point instead of all non-comics material. This would be a very worthwhile endeavour as it would enable the authors to both present empirical evidence as well as conclusions on a theoretical level. Especially if the effects do not obtain in the expected direction, this would enable a more fruitful discussion of the factors that may be responsible for this outcome, such as specific characteristics of the study designs or materials of the studies included in the meta-analysis.

This is still the main issue I see with the plan. Again, I think this is a very worthwhile endeavour, and it would be much more so with more reasoning about the plausibility of the effects under analysis, not less. I will report some smaller details that have come up in the revision in the following.

  • I wonder if the title is still appropriate. It still says "Learning from comics versus non-comics", whereas the focus is now specifically on the comparison with texts.
  • The meta-analysis by Topkaya et al. (2023) is cited in the text but does not appear in the References section.
  • Table 1 does not have a caption.
  • Right now, Table 1 seems a little superfluous because much of it is redundant: the entries in all cells for the columns "Hypotheses", "Sampling plan", "Analysis plan", "Rationale for deciding the sensitivity etc" and "Theory that could be shown wrong etc." are the same for all three research questions.
  • There is a formatting problem in the References section as some references are indented, some not.
  • I do not think generative AI solves the translation problem mentioned by myself and others; it may even exacerbate it. You cannot check the accuracy of the translation for languages that you do not know. The example given by Solip Park was an enlightening one and only one of the many problems that could arise.
  • I welcome the calculation of inter-rater reliability, but would have expected a little more detail, closer to what was reported in the "Statistical analysis" section: How will the agreement be calculated, what coefficient will be used, and what is the cutoff for a satisfactory agreement? What happens if this is not reached?
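
As a minimal sketch of what such a specification might look like, the following computes Cohen's kappa for two coders who categorise the same studies on a nominal variable. The choice of kappa (rather than, for example, Krippendorff's alpha), the category labels, and the 0.80 cutoff are assumptions made purely for illustration; none of them is taken from the protocol.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders rating the same items on a nominal scale."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must rate the same non-empty set of items.")
    n = len(coder_a)
    # Proportion of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's marginal frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one and the same category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for the comparison medium of four studies.
coder_a = ["text", "text", "video", "video"]
coder_b = ["text", "text", "video", "text"]

HYPOTHETICAL_CUTOFF = 0.80  # illustrative threshold, not from the protocol
kappa = cohens_kappa(coder_a, coder_b)
print(f"kappa = {kappa:.2f}, satisfactory: {kappa >= HYPOTHETICAL_CUTOFF}")
```

A preregistered plan could then state which coefficient is computed per coded variable, the cutoff, and the fallback (for example, refining the coding scheme and recoding a new subset) if the cutoff is not reached.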

Again, I hope these comments are helpful to the authors and I would be very glad to see this manuscript go to the next round!

Reviewed by , 10 May 2024

Download the review

Reviewed by Pavol Kačmár, 19 Apr 2024

I would like to express my gratitude to the authors for their effort in revising the proposal and incorporating suggestions.

I have no further input on how the proposal could be improved, and I believe that the present version of the PCI RR submission, entitled "Learning from comics versus non-comics material in education: Systematic review and meta-analysis", meets Stage 1 criteria and can be recommended.

I eagerly await the study findings, and I wish the authors all the best.

Best regards, Pavol Kačmár, PhD. 

Evaluation round #1

DOI or URL of the report:

Version of the report: 1

Author's Reply, 05 Apr 2024

Decision by Veli-Matti Karhulahti, posted 28 Nov 2023, validated 28 Nov 2023

Dear Marianna Pagkratidou and co-authors,
Thank you for submitting your highly interesting Stage 1 manuscript to PCI RR. We have been very lucky to receive no less than four helpful reviews, from experts in education, meta-analyses, and comics. All reviewers are generally positive about the research plan and support inviting a revision. Because the reviews are extensive, I will minimize my own comments and merely recap the most significant points that should be focused on in the revision.
1. The reviewers consistently point at the problematic comparison between “comics and non-comics”. I agree with them and encourage you to follow any effective solution of your preference, perhaps one of those kindly suggested by the reviewers. 
2. The reviewers also voice an issue of testing hypotheses without an underlying theory or other explanation that would justify the hypotheses (a hypothesis without reasoning is just a guess!). I tend to agree and, as the reviews suggest, encourage you to either formulate a theoretical, empirical, or other basis for testing the chosen hypotheses or transform the plan toward a more exploratory direction. If you choose to keep the hypotheses for testing, please move the supplement table (from OSF) to the end of the revised manuscript text file.
3. As the reviewers note, PCI RR generally discourages using rule-of-thumb effect sizes (like those by Cohen) and instead recommends justifying the range of meaningful and meaningless effects. A good additional source for this topic suggested by the PCI RR guidelines can be found here (Dienes 2021):
4. Finally, I echo the reviewers’ concerns about including any languages. It feels challenging to build a robust systematic plan that can ensure access to any language (both in terms of locating and reviewing relevant studies). A good solution could be to either limit the languages to those for which the authors and their collaborators have direct fluency, and/or review other languages in a separate exploratory section.  
I hope the reviews, as summarized by these notes, help you to make the plan even stronger. Please kindly include point by point responses to all the review comments in the revision. 
I want to be clear that the idea of RRs is not to block any preferred research goals or force researchers in directions in which they are not interested. Accordingly, if you consider some of the feedback not justified or that revising based on the feedback changes the plan too much from your goals, you are free to rebut any comment with a counter argument. If needed, you can also contact me directly during the revision process and we can together negotiate solutions for any part of the feedback in more detail. Our goal is simply to collectively make this the best possible study on your chosen topic.
All the best wishes and much looking forward to the next version,
Veli-Matti Karhulahti


Note from the Managing Board: Please note that to accommodate reviewer and recommender holiday schedules, PCI RR will be closed to ALL submissions from 1st December 2023 until 10th January 2024. During this time, reviewers will remain able to submit reviews, and recommenders can issue decision letters for ongoing submissions, but no new or revised submissions can be made by authors.

Reviewed by Adrien Fillon, 31 Oct 2023

Please look at the attached document,

Adrien Fillon


Reviewed by , 23 Nov 2023

The authors describe a planned review and meta-analysis that is concerned with the effect of using comics as educational materials on knowledge gains. I think the topic is relevant for the field and would find a systematic analysis of the effectiveness of comics worthwhile. However, although the Stage 1 Registered Report makes the procedure reasonably clear (with some exceptions which I will detail below), I am unsure whether the question, as the report currently poses it, is a well-formulated and reasonable one. I will first state the main problems that I see with the authors' reasoning and approach.

There are three main problems regarding the derivation of the research question (or the effects to be meta-analysed), that make me question its reasonableness: 1. What media comparisons are sensible? 2. What is a comic and in what way exactly does it differ from other (visual) media? 3. What makes STEM subjects different and how does this impact learning with media?

There is a long history of media comparison research in educational psychology, but it is also a very rocky one. A prominent example is the Clark-Kozma debate on learning with digital media, that revolved around the question whether the use of (digital) media per se had specific impacts on learning or whether any effects were exclusively a consequence of teachers structuring the instruction in a specific manner around the medium. I think the same question has to be asked here: Is there a specific effect of comics as a medium or does this depend too strongly on how they are used, in which context and to what end? Even if the authors expect a specific effect, I found that the introduction did not make it very clear how this effect comes about. Does it have to do with the sequentiality of comics? With the combination of text and images? With the fact that comics will be perceived as entertaining by students? This leads to the second problem: What exactly is the comparison point?

The authors define comics "as a particular type of social object, used by people of a particular cultural orientation, which use visual language (sequential images) and writing, typically associated with contexts and styles" (p.5). Some of these concepts are not explored further (what cultural orientation? what kind of contexts and styles?), so I have to assume they are not important here. If it is the sequentiality and the combination of images and writing, I wonder what it is about these that should make comics superior to other media.

The problem here is that the authors plan to compare comics to "non-comics", i.e. basically all other kinds of media - or at least visual media - that are not comics! That in itself seems like a very asymmetric comparison. The authors state that "media comparisons also differ between text [...], animation [...], or video." (p. 7) This not only ignores the heterogeneity of these media types in themselves but also the host of other visual media. Images (moving or non-moving) can be symbolic (such as a diagram) or analog/depicting (such as a photograph). Images and text can be combined, which is a whole area of research (e.g. Schnotz's "integrated model of text and picture comprehension", 2005), of which comics would arguably be a special case. The problem that I see is that whether comics are more or less effective depends not only on the context and content (see problem 1.) but also on which medium specifically they will be compared to, because in what way comics differ from another medium determines whether an effect obtains or not. Comics and videos are both sequential, but comics are combined with text and videos (typically!) are not. If you compare comics with infographics the opposite is true!

In my opinion, the authors need to offer a clear account of what characteristic of comics it is that provides the benefit, and in what situation. That precludes, in my view, a broad comparison of comics vs. non-comics because the characteristics that differ will vary. In some cases e.g. sequentiality may provide a benefit, in some cases it may not, or may even be detrimental. I am not convinced that the moderator variables that the authors consider adequately capture these factors. In this way, I think the authors' first hypothesis is not well argued.

The third point is related to this one, and that is the STEM/non-STEM comparison. Here, again, it is unclear which characteristics of STEM subjects make comics especially apt for these contexts: The authors state that "the background information provided in the panels might operate as scaffolding for more effective learning about STEM than non-STEM concepts" (p. 9) - in what way are STEM concepts different such that comics are a superior medium than others to learn them? Again, I think the report lacks a coherent theoretical reasoning from which to derive this hypothesis.

I realise that I'm basically asking "What is the process?", a kind of question that is receiving some push-back recently. I acknowledge that investigating effects can be enlightening without a strong theoretical reasoning, but this is, in my reading, not what the authors are setting out to do, especially since they are formulating hypotheses around the specific comparisons. Thinking about the title it occurred to me that the problem may be that the authors want to make this a systematic review as well as a meta-analysis. Although the review part is not elaborated upon very strongly, it suggests to me that the authors are looking at both theoretical as well as empirical aspects of the effects of comics.

In summary, I personally do not think it would be worthwhile to go through with the plan as the authors have formulated it. The reasoning why comics should be superior to all "non-comics" media does not seem sound to me, nor does the reasoning why comics should be more effective in STEM than in non-STEM subjects. However, as I said above, I think the topic in itself is worthy of consideration. In the sense of constructive criticism, I personally see two ways for the authors to proceed in order to make this a worthwhile line of inquiry:

The first would be to elaborate more specifically on the characteristics of comics that make them special, that distinguish them from some specific other type(s) of media. Think about which of these characteristics should improve learning gains in specific situations (i.e. regarding a certain type of content, when used in a certain kind of way, etc.), and what kinds of affordances they bring to the learning situation. Then focus on this (set of) affordance(s) and review/meta-analyse studies that pertain to it. Let us assume (just for the sake of argument, I am not an expert in this specifically) that the sequential nature of comics should be especially beneficial for learning procedural knowledge. Then studies could be sought out that compare comics with text-image combinations that are not sequential. The type of knowledge learned could be a moderator here, and effects should be stronger for procedural knowledge than, say, conceptual knowledge.

However, this may be premature if the theoretical ideas around learning from comics are not yet specific and precise enough. In this case I would suggest - as my second proposal - to defer the meta-analysis to a later point in time and focus on the systematic review in order to tease out those characteristics. The strategies that the authors describe for literature search and categorisation would still be useful for this. The authors could more strongly focus on how comics are actually implemented in learning situations, with what kinds of media they are contrasted etc.

I hope the authors take these comments in the manner they were intended: As constructive criticism and in the hope that they are helpful for revising their approach, if they wish to do so.

In the following I will go through the text of the report in order, to point out additional smaller things as they occurred to me:


  • I assume ASD refers to "autism spectrum disorder" but I am not entirely sure and I think the authors should write it out.
  • The researchers seem to be active in the field - they should be explicit about how they can ascertain that their own studies will not be given preferential treatment or weight in their literature search and the following categorisation and analysis.
  • In Table 1, the authors first formulate directional hypotheses ("We expect comics to have a greater impact..."). However in the "Interpretation" column, they also allow directional interpretations in opposite directions. So I am unsure if there are competing, directional hypotheses (which would be fine if both were argued for!) or if the authors allow themselves to support any kind of hypothesis.


  • "We will sequentially screen titles, abstracts, and full-texts." (p. 11) I am unsure what this entails and what exactly the criteria for inclusion are. The goal of the registered report is to enable replication and guard against procedural flexibility, and I think this is one point where subjectivity may come in.
  • Will the Zotero database be published?
  • I am unfamiliar with Rayyan and would need more information about how it works. One relevant question regarding reproducibility would be how determinate its output is.
  • Regarding inclusion criteria:
    • I was unsure what the authors meant by "general population samples" (p. 11). Are there any samples that this excludes? If not, then I think that this is not really a criterion. It could be a moderator variable though.
    • I was unsure what exactly the authors consider to be "sufficient data" (p. 11). What does it mean for data to be "usable for the analysis"? What kind of statistical values would need to be reported?
    • As I mentioned above, the authors would need to be explicit about which outcome measures would be eligible and which would be excluded. What kind of learning will be targeted? Will studies be excluded that report motivation measures? Small-group interaction? Procedural knowledge? etc.
    • Re: publication type/research design: What about studies with more control groups or different kinds of comics? What will suffice as a control group? What if the second group is not labelled as a control group?
    • I am very doubtful about including studies in languages other than those that the authors are proficient in. Although Google Translate has come a long way, I find that especially scientific jargon can be a gamble.
    • I was unsure what the authors meant by "We will extract the sex of the participants separately for each group (experimental vs. control)". How will this enter the analysis? The ratio of male to female participants per experimental group? Whether the study tested male vs. female? Whether gender was a covariate?
    • The distinction of comics as complementary vs. main medium seems very vague to me. There should be more information about what characterises each application.
    • In general, because some of these categorisations are at least partly matters of judgment, involving some uncertainty, I think only resolving inconsistencies by a third person is inadequate. My suggestion would be for the two coders to categorise a subset of the data, for which inter-rater reliabilities are then calculated, refining the system until agreeable consistency is achieved. This procedure and what is an acceptable level of agreement (e.g. Krippendorff's Alpha or similar) should be made explicit in the pre-registration.
    • I was unsure what exactly was meant by "level of knowledge-achievement". I thought that knowledge was the outcome. Maybe the authors mean prior knowledge and whether that was entered as a covariate.
    • The categorisation of the control conditions, as mentioned above, falls short, in my opinion. Media use in education is very diverse, other forms of images and visual media, with and without text, sequential or non-sequential, symbolic (such as diagrams) as well as analogue (such as depictions) are ubiquitous. To choose only photos and animations as contrasts offers a very restricted view of visual media in the educational context.
  • Statistical analysis
    • The outcome is stated as "knowledge". Both the concept of knowledge itself and its measurement are very complex. In educational studies, several different kinds of knowledge are common as outcome measures. However, these are not equivalent. Knowledge can be declarative, conceptual, procedural etc. It can be measured by multiple choice tests, written or oral exams, behavioural measures, teacher ratings, peer ratings, self-report etc. The studies will surely differ widely in this regard, and this needs to be represented in the categorisation of the studies.
    • Apart from this, I found the statistical analysis section the most convincing. (However, I am not very familiar with meta-analytical methods so an expert on the topic may have more to say on this topic.)
    • How do the authors justify choosing r = 0.5 for studies that do not provide a coefficient for calculating d for repeated measures?
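
To make the stakes of this choice concrete, here is a minimal sketch of how an imputed pre-post correlation r enters the effect size calculation. It uses one common textbook formulation of the standardised mean change and its sampling variance; whether the authors use this exact estimator is an assumption, and the study values (n, means, SDs) are hypothetical.

```python
import math


def repeated_measures_d(mean_pre, mean_post, sd_pre, sd_post, n, r=0.5):
    """Standardised pre-post mean change and its sampling variance.

    r is the pre-post correlation; it has to be imputed when a study
    does not report it.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1.")
    # Standard deviation of the difference scores implied by r.
    sd_diff = math.sqrt(sd_pre**2 + sd_post**2 - 2.0 * r * sd_pre * sd_post)
    # Rescale from the difference-score metric to the raw-score metric.
    scale = math.sqrt(2.0 * (1.0 - r))
    d = (mean_post - mean_pre) / sd_diff * scale
    var_d = (1.0 / n + d**2 / (2.0 * n)) * 2.0 * (1.0 - r)
    return d, var_d


# Hypothetical study: n = 20, pre-test mean 50, post-test mean 55, both SDs 10.
for r in (0.3, 0.5, 0.8):
    d, var_d = repeated_measures_d(50, 55, 10, 10, n=20, r=r)
    print(f"r = {r:.1f}: d = {d:.3f}, variance = {var_d:.4f}")
```

In this hypothetical case the point estimate hardly moves, but the variance, and hence the weight the study would receive under inverse-variance weighting, changes considerably with r. A sensitivity analysis over a range of plausible r values may therefore be worth specifying in advance.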

Reviewed by Solip Park, 03 Nov 2023

Summary of this RR

Aim: The authors seek to explore whether comics affect learning differently.

Problem statement: Despite the increase in the usage of comics in classrooms daily either as independent reading or as a supplement to the main lesson, there are inconsistencies in findings regarding the effectiveness of learning when using comics compared to non-comics material.

Goal: To systematically review and quantify using meta-analysis the overall effect of comics vs non-comics material that have been used in empirical studies that targeted learning for STEM and non-STEM fields.


Research setting:

In RQ1: How will the authors measure “better learning”?

In RQ2: How will the authors measure “putative effectiveness”?

This is perhaps the main point to measure to extract ‘Achievement level’ from the data.



Perhaps the authors could give a few examples of what they expect when saying “…empirical studies that compare comics with any non-comics material”. The abstract says “e.g., text or video” and the data extraction chapter mentions the three main non-comics materials in its criteria: text (=1), photo (=2), animation video (=3). But what about other materials? What about those that are ‘somewhere’ in between? What first comes to my mind is whether ‘visual novels’ and ‘comical storybooks’ could be regarded as ‘comic’ in this systematic review, as long as the authors of the original paper have identified the object as “comic”. And what about games (e.g., board games) – would they be regarded as text, photo, or animation video? To avoid these potential terminological confusions, perhaps the authors could consider listing some examples and the rationale behind these choices.

Google Translate’s effectiveness in some languages can be somewhat questionable. For example, Korean and Japanese (the languages that I can speak) have multiple words for “comic” (e.g., comic, manga/manhwa, toon, webtoon, etc.) with subtle differences in nuance and tone, and I wonder how accurately Google's translation AI is able to render them. I quickly checked that, in my native language Korean, comics (“Manhwa”) is translated into English by Google Translate as either “comic” or “cartoon”.

I consider that it would also be beneficial for the readers to know whether the comic or non-comic materials used in this research will be ones developed by the research team themselves or sourced externally (by using or modifying existing materials). It would also be interesting to see how this corresponds to the learning outcomes. (Or is the “Experimental design” criterion already covering this aspect? I wasn’t sure from the current RR.)

Reviewed by , 27 Nov 2023

Thank you for the opportunity to review the protocol: Learning from comics versus non-comics material in education: Systematic review and meta-analysis. The review is attached as a pdf file. 

