Getting the numbers right in Parkinson's disease?

Zoltan Dienes, based on reviews by Pia Rotshtein, Ann Dowker, Stephanie Rossit and 1 anonymous reviewer
A recommendation of:

Arithmetic deficits in Parkinson's Disease? A registered report


Submission: posted 29 June 2021
Recommendation: posted 08 February 2022, validated 08 February 2022
Cite this recommendation as:
Dienes, Z. (2022) Getting the numbers right in Parkinson's disease?. Peer Community in Registered Reports, .


Everyday life, including for patients taking different types of medicine, involves dealing with numbers. Even though Parkinson's disease may ordinarily be thought of as primarily being a motor disorder, there is evidence that numerical abilities decline as Parkinson's disease progresses. Further, the brain areas involved in arithmetic operations overlap with the areas that degenerate in Parkinson's disease.

In this Stage 1 Registered Report, Loenneker et al. (2022) will test healthy controls, Parkinson's disease patients with normal cognition, and Parkinson's disease patients with mild cognitive impairment on general working memory tasks as well as on arithmetic performance for the four basic operations (addition, subtraction, multiplication, division). The study aims to test whether or not there is a deficit in each operation, and how any deficits relate to general working memory capacity.

The Stage 1 manuscript was evaluated over four rounds of review (including two rounds of in-depth specialist review). Based on comprehensive responses to the reviewers' comments, the recommender judged that the manuscript met the Stage 1 criteria and therefore awarded in-principle acceptance (IPA).

URL to the preregistered Stage 1 protocol:

Level of bias control achieved: Level 6. No part of the data or evidence that will be used to answer the research question yet exists and no part will be generated until after IPA.

List of eligible PCI RR-friendly journals:


Loenneker, H. D., Liepelt-Scarfone, I., Willmes, K., Nuerk, H.-C., & Artemenko, C. (2022). Arithmetic deficits in Parkinson’s Disease? A Registered Report. Stage 1 preregistration, in principle acceptance of version 4 by Peer Community in Registered Reports.

Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.

Evaluation round #4

DOI or URL of the report:

Version of the report: Manuscript_Loenneker_reg_rep_acalculia_PD_revision3

Author's Reply, 04 Feb 2022

Decision by Zoltan Dienes, posted 21 Jan 2022

Dear Hannah

The Tattan-Birch calculator assumes the same units for both the mean and the SE for the likelihood AND for the parameters of the model of H1. That is, you can model H1 with a Cauchy centered on 0 as you do, but the scale factor should be in units of the mean difference. (On that point, always state your units, e.g. "Mdifference = -5.3": 5.3 what? %, ms, etc.) Thus, a rule of thumb is that if the best estimate from past studies is about 5 (of whatever units, e.g. %), then use 5% as the scale factor. Note this is different from e.g. JASP (in most cases) or the Rouder online calculator, which can only use Cohen's d (or dz) units for the scale factor. You could use the JASP/Rouder/Morey calculator with Cohen's d units and hence use a scale factor of 1; or use the Tattan-Birch(/McLatchie/Colling/Dienes) calculator with a scale factor of 5. This will change the interpretation of your table indicating the robustness of the results, and which values you will likely want to put in the table.
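The point about units can be illustrated with a minimal Python sketch. It is not the code of any of the calculators named above; it assumes a normal likelihood, a Cauchy model of H1 centred on 0, and H0 as a spike at 0, and it integrates numerically. The function names and the example numbers (a difference of -5.3 percentage points with an SE of 2.0) are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def cauchy_pdf(x, scale):
    """Density of a Cauchy distribution centred on 0."""
    return 1.0 / (math.pi * scale * (1.0 + (x / scale) ** 2))


def bf10_cauchy(obs_mean, obs_se, scale, steps=4001):
    """Bayes factor for H1 (Cauchy(0, scale) on the mean difference)
    against H0 (a spike at 0), assuming a normal likelihood.

    obs_mean, obs_se and scale must all be in the SAME units.
    """
    # Trapezoidal integration over obs_mean +/- 10 SE, where the
    # likelihood is non-negligible.
    lo = obs_mean - 10.0 * obs_se
    hi = obs_mean + 10.0 * obs_se
    h = (hi - lo) / (steps - 1)
    total = 0.0
    for i in range(steps):
        theta = lo + i * h
        weight = 0.5 if i in (0, steps - 1) else 1.0
        total += weight * normal_pdf(obs_mean, theta, obs_se) * cauchy_pdf(theta, scale)
    marginal_h1 = total * h
    return marginal_h1 / normal_pdf(obs_mean, 0.0, obs_se)


if __name__ == "__main__":
    # Hypothetical data: mean difference -5.3 percentage points, SE 2.0.
    print(bf10_cauchy(-5.3, 2.0, scale=5.0))  # scale stated in percentage points
    print(bf10_cauchy(-5.3, 2.0, scale=1.0))  # a d-unit scale of 1 misread as raw units
```

The two calls give different Bayes factors for the same data, which is why a scale factor chosen in Cohen's d units cannot simply be carried over to a calculator that works in raw units.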



Evaluation round #3

DOI or URL of the report:

Version of the report: Manuscript_Loenneker_reg_rep_acalculia_PD_revision2

Author's Reply, 20 Jan 2022

Decision by Zoltan Dienes, posted 03 Dec 2021

Sorry, just a couple of minor things.

This may have been clear to me before but it is not now. Why is your robustness analysis relevant to your proposed analysis? Make sure you state the model of H1 you will use (saying "uninformed" does not specify it, as there are a number of "uninformed" models of H1). I presume you mean a Cauchy centred on zero with scale factor = 0.7 in d units. Your simulations use different centring and scale factors. What would you get if you used your proposed model? H0 is typically a spike at 0. Why not use this in your robustness analysis? Indicate the standardized effect size of the previous study you base your reasoning on, to give some justification for why 0.7 may be relevant. (Bear in mind there are really no "uninformed" models of H1; every model is a conjecture about the probable effect sizes in a scientific context, so a default is just a suggestion that you should check adequately represents the plausibility of different effect sizes in your context.) Apologies if I missed an explanation somewhere!

And while I have the chance, a small matter. In the design table for your third section, hypothesis column: b) "It is not clear whether..." is not a hypothesis; rephrase it as a hypothesis that could be refuted by the data.

Evaluation round #2

DOI or URL of the report:

Version of the report: Manuscript_Loenneker_reg_rep_acalculia_PD_revision1

Author's Reply, 25 Nov 2021

Decision by Zoltan Dienes, posted 09 Nov 2021

The reviewers are very happy with your revisions. Rotshtein asks you to consider some possible additional control groups; you can decide the feasibility of this.

One further point. Exploratory analyses are not pre-registered in Stage 1. In your design table you list one set of analyses as exploratory and without hypotheses; yet your interpretation involves conclusions about substantial theory. In what sense do you mean they are exploratory? If you mean you see analytic and interpretative flexibility, then they should not be registered in the Stage 1; but you are free to perform them at Stage 2 in a non-pre-registered section. I am not sure you mean this, as it seems there is clear theory at stake and you can specify the analyses and the interpretation. Please clarify.

You could also simplify in another way: You propose for some analyses to first do an omnibus ANOVA and then the planned contrasts; but conclusions follow only from the contrasts, which also define the stopping rule. You need therefore only perform the contrasts. You might wish to have a separate row for each contrast to indicate what theoretical claim precisely is at stake for each contrast. At the moment it is not clear what follows from one contrast showing an effect and the other showing no effect.

I won't be sending the manuscript back for review in order for you to obtain IPA, as the reviewers are very positive; I will make that decision myself.

Reviewed by , 05 Nov 2021

Dear Editors, I am very happy to endorse the publication of this manuscript and have no further comments as the authors have done a fabulous job at revising. I look forward to learning about the results of the study in due course and wish the authors the best of luck!

Reviewed by , 08 Nov 2021

I would first like to thank the authors again for their very thorough and thoughtful work and their response to our comments. I also apologise for the delay in reviewing the paper.

While the work is excellent and thorough, it is also very complex. I fear this will be a future barrier for readers to appreciate its quality, especially if you want to communicate the results to medical professionals and clinicians. I am not sure what the authors can do to simplify it, but maybe it is something to bear in mind when reporting the results.

Two final thoughts:

Control group:

1) I appreciate that an MCI control group is likely to be heterogeneous and may include some overlapping PD-like aetiology. Please make sure you bring this up in the study limitations: a difference between PD-MCI and PD-NC may reflect comorbidity with other degenerative symptoms rather than relate to PD severity per se.

2) Given the potential difference in socio-demographics between PD-NC and PD-MCI, maybe include two healthy control groups, one matched to each group. Or, based on the adapted analysis approach using two t-tests (PD-NC vs HC, PD-MCI vs PD-NC), make sure the HC group matches the PD-NC group better.


Thank you

PS, Zoltan, I do not need to review it again; I am happy with whatever decision you make regarding the authors' responses.

Evaluation round #1

DOI or URL of the report:

Author's Reply, 22 Sep 2021

Decision by Zoltan Dienes, posted 19 Aug 2021

Dear Hannah


I now have two reviewers' reports for your manuscript. Like the reviewers, I thought your submission was well thought through. The reviewers make some excellent points, which I ask you to respond to in a revision. I also have some points for you to address:

Put the study design table in the manuscript.

"The difference between PD-NC and PD-MCI cannot be inferred from current literature, which is why we only predict a trend of HC outperforming PD-NC and PD-NC outperforming PD-MCI."

I am not sure what you mean by a trend here. It sounds like you predict two pairwise comparisons. Please clarify. Why not explicitly predict these two, and say for each comparison what theory hangs on the line?

What you say in the "sampling plan" part of the table is "participants will be tested within a sequential Bayes factor design until the between-subjects factor of group reaches a value of BF10 ≥ 6 or BF01 ≥ 6 for all of the four basic arithmetic operations in research questions (1) and (2)."
This doesn't test a trend but an omnibus main effect of group. As an omnibus test it does not precisely test your claims, and it requires more subjects than a 1-df test. You could think about two pairwise comparisons, as I say.
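The stopping rule quoted above can be sketched in a few lines of Python. This is only an illustration: the function names, the dictionary keys, and the optional maximum sample size are hypothetical, and the Bayes factors themselves would come from whichever test (omnibus or pairwise contrast) is finally registered.

```python
def is_decisive(bf10, threshold=6.0):
    """True if the evidence favours H1 (BF10 >= threshold)
    or H0 (BF01 = 1 / BF10 >= threshold)."""
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive.")
    return bf10 >= threshold or (1.0 / bf10) >= threshold


def should_stop(bf10_by_test, threshold=6.0, n_tested=None, n_max=None):
    """Stop sampling once EVERY registered test is decisive, or once an
    assumed maximum sample size has been reached."""
    if n_max is not None and n_tested is not None and n_tested >= n_max:
        return True
    return all(is_decisive(bf, threshold) for bf in bf10_by_test.values())


if __name__ == "__main__":
    # Hypothetical interim Bayes factors, one per arithmetic operation.
    interim = {"addition": 7.2, "subtraction": 0.1,
               "multiplication": 3.0, "division": 9.5}
    print(should_stop(interim))  # multiplication is still inconclusive
```

If the pairwise comparisons are registered instead of the omnibus test, the same check applies with one dictionary entry per contrast.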

You do not specify a predicted effect size for tests for question 3, nor for covariates in the previous analyses.
You might (or might not) find advice here helpful:
Dienes, Z. (2019). How do I know what my theory predicts? Advances in Methods and Practices in Psychological Science, 2, 364-377.

One thing I will recommend from that paper is reporting a robustness region for each test, i.e. the set of scale factors that result in the same qualitative conclusion as you reach with your pre-specified scale factor.
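A robustness region can be approximated by scanning candidate scale factors and keeping the contiguous range around the pre-specified one that yields the same qualitative conclusion. The Python sketch below is a rough illustration under stated assumptions: a normal likelihood, a Cauchy model of H1 centred on 0, a threshold of 6, and hypothetical data and function names. It is not taken from the paper cited above.

```python
import math


def bf10(obs_mean, obs_se, scale, steps=4001):
    """BF10 for a Cauchy(0, scale) model of H1 against a spike at 0,
    assuming a normal likelihood (all inputs in the same units)."""
    def normal_pdf(x, mean, sd):
        z = (x - mean) / sd
        return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

    lo, hi = obs_mean - 10.0 * obs_se, obs_mean + 10.0 * obs_se
    h = (hi - lo) / (steps - 1)
    total = 0.0
    for i in range(steps):
        theta = lo + i * h
        prior = 1.0 / (math.pi * scale * (1.0 + (theta / scale) ** 2))
        weight = 0.5 if i in (0, steps - 1) else 1.0
        total += weight * normal_pdf(obs_mean, theta, obs_se) * prior
    return total * h / normal_pdf(obs_mean, 0.0, obs_se)


def conclusion(bf, threshold=6.0):
    """Map a Bayes factor onto a qualitative conclusion."""
    if bf >= threshold:
        return "H1"
    if bf <= 1.0 / threshold:
        return "H0"
    return "inconclusive"


def robustness_region(obs_mean, obs_se, chosen_scale, candidate_scales, threshold=6.0):
    """Smallest and largest candidate scale, contiguous with chosen_scale,
    that give the same conclusion as chosen_scale."""
    scales = sorted(set(candidate_scales) | {chosen_scale})
    labels = [conclusion(bf10(obs_mean, obs_se, s), threshold) for s in scales]
    idx = scales.index(chosen_scale)
    lo = hi = idx
    while lo > 0 and labels[lo - 1] == labels[idx]:
        lo -= 1
    while hi < len(scales) - 1 and labels[hi + 1] == labels[idx]:
        hi += 1
    return scales[lo], scales[hi], labels[idx]


if __name__ == "__main__":
    # Hypothetical data: mean difference -6.0 (raw units), SE 2.0.
    print(robustness_region(-6.0, 2.0, 5.0, [0.5, 1, 2, 5, 10, 20, 50]))
```

The resolution of the reported region is limited by the grid of candidate scales, so a finer grid near the boundaries gives a more precise region.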

There is still a fair amount of wriggle room for your conclusions for question 3 - please tighten up.
Don't register your exploratory analyses in Stage 1, i.e. those for which you haven't tied down your analytic options. Just put them in a non-registered section of your results when you submit Stage 2.

Reviewed by , 21 Aug 2021


Download the review

Reviewed by , 09 Aug 2021

Loenneker and colleagues propose to register a study that examines simple arithmetic abilities in two groups of patients with Parkinson's disease, those who show no cognitive deficits (PD-NC) and those who show mild cognitive impairment (PD-MCI), in comparison to healthy controls (HC). The study is paired with an additional registered study that examines non-symbolic quantity abilities in the same sample.
Participants will be assessed on addition, subtraction, multiplication and division. Accuracy, reaction time and some derived measures (differences between trials: borrow and carry-over effects) will be the dependent measures. The authors will also consider confounding factors related to socio-demographics and clinical symptoms.

The authors aim to answer three questions: Is PD related to calculation deficits? Can the observed calculation deficits be accounted for by core cognitive abilities (e.g. attention, executive function, working memory)? Can calculation problems be used to detect early PD (PD-NC vs. HC), or do deficits emerge at later PD stages (PD-NC vs. PD-MCI)?
Overall, this is a very large and impressive project. It is very comprehensive and it is clear that the authors have considered the study design very carefully and thoughtfully. The project is exploratory by nature and, given its extent and number of measurements, it is quite complex.
Some points to consider:
-       The main point I want to raise regards the nature of the healthy control group. This is a general comment that probably relates to most studies that use neuropsychological measures to characterise a specific patient group. The question is who makes an appropriate control group that differs only on the factor of interest, here PD.

o   The authors should be commended for aiming to recruit carers of the PD patients, which is likely to account for some of the socio-demographic confounds.

o   I suggest the recruitment of the control group be more purposeful, with the aim of matching them as much as possible on all aspects of the two PD groups.
§  Matching the three groups on age, gender, education, SES and at least childhood testified confidence in math.
§  Including two HC groups, one without mild cognitive impairment and one with MCI, is important for arguing that it is the PD that leads to acalculia rather than the MCI.

-   If I understand correctly, the upper limit on the number of participants is 120.

o Will these be equally divided across all three groups?

o Could you provide a power analysis for 120 participants for a medium effect size, as this is the maximum power that can be achieved?

o It may be that, when considering all the additional covariates, 120 is too small a sample.

-       The analysis plan proposes excluding participants who get the first 10 questions wrong in one of the tasks. This may dilute your effect. Maybe include some very simple arithmetic in the practice session to ensure that they understand the task instructions. Then, if they understand the task but nevertheless fail the first 10 questions, you can just award them a zero score.
o   Will the first 10 be the easier, less complex examples?
o   Will participants be asked if they want to stop or continue?
o   Will participants receive feedback on their answers?

-       The number of planned tests is huge. It is good that you opted for Bayes factors, as you will not need to correct for multiple comparisons. However, if only a single one of these tests yields a significant result, what would you conclude about the relation between PD and calculation processing?

-       When computing the 'borrow effect' could you please provide a clear definition of problem size.
-       Could you please provide a clear definition of the task complexity factor that will be used to answer Q2.
-       Given the number of potential covariates, it may be more useful to remove the variability due to socio-demographic and clinical factors from task performance before moving on to Q2 (impact of core functions), and then to remove the additional variability due to cognitive covariates from task performance before using the data to answer Q3 (ability of calculation to discriminate between the three groups).
-       Could you please provide a measure that will ensure that the quality of the data is high (e.g. expecting to replicate the basic effects of task complexity on accuracy and RT, independent of PD, or across all groups)?

- Dividing the text with sub-headings will aid the reading and comprehension of all sections. Using some additional tables to summarise the information (description of tasks, measures collected) can also help.

Reviewed by anonymous reviewer 1, 19 Aug 2021

Thank you for asking me to review this detailed, well-written and interesting RR. The research question has good validity, and the rationale of the study as well as the hypotheses make sense. The sample size and inclusion/exclusion criteria are well defined and justified. Overall, the analysis is feasible and appropriate to answer the questions. However, I have identified the following concerns that warrant expansion/clarification:

-          I understand the RR will be run alongside the other similar study already accepted in principle. Can you give us details on how task order and administration of both studies will be managed, and lay out clearly what data overlap between the two studies?

-          The introduction is quite long and needs to be more succinct, focusing on the question at hand and leading to the research question and hypotheses. It is missing a good overview of PD and its epidemiology and symptoms. It is not common to have hypotheses in the supplementary material, so I would incorporate them within the main text.

-          Within the arithmetic tasks you propose to analyse RT. However, the response is entered by the experimenter. Could you use a microphone instead and record the verbal response? Otherwise, your RT measure will not be very reliable.

-          The authors should add a section on the limitations of the study and how these will be dealt with regarding design, analysis, and previous findings. Issues covered could include, for example, the heterogeneous clinical sample, fatigue, and diagnosis errors. Finally, a section on potential unexpected outcomes should also be included, covering the analysis or other steps taken to overcome these.

Minor issues

‘the magnetic resonance imaging volume of the angular gyri’ – this is unusual language here and instead I would remove MRI and just say cortical volume?

A few typos noted (e.g., page 5 end paragraph, line 7).

Spell out acronyms first time within text (e.g., ADL; MOCA)

Some of the paragraphs are long and should be split into separate paragraphs (e.g., page 3).
