2 Conceptual and Statistical Knowledge
7 sub-clusters · 82 references

A grounding in fundamental statistics and measurement and their implications, encompassing the conceptual knowledge, application, interpretation, and communication of statistical analyses. There are 7 sub-clusters, which further parse the learning and teaching process:
Effect sizes, statistical power, simulations, & confidence intervals.
Statistics are more than p-values: we need other benchmarks to determine the statistical and practical relevance of an effect. This sub-cluster emphasizes effect sizes, confidence intervals, power, and simulations to design adequately powered studies and communicate practical significance.
- Abt, G., Boreham, C., Davison, G., Jackson, R., Jobson, S., Wallace, E., & Williams, M. (2025). Sample size estimation revisited. Journal of Sports Sciences, 43(21), 2511–2516. https://doi.org/10.1080/02640414.2025.2499403
- Abt, G., Boreham, C., Davison, G., Jackson, R., Nevill, A., Wallace, E., & Williams, M. (2020). Power, precision, and sample size estimation in sport and exercise science research. Journal of Sports Sciences, 38(17), 1933–1935. https://doi.org/10.1080/02640414.2020.1776002
- Arel-Bundock, V., Briggs, R. C., Doucouliagos, H., Aviña, M. M., & Stanley, T. D. (2026). Quantitative Political Science Research Is Greatly Underpowered. The Journal of Politics, 88(1), 36–46. https://doi.org/10.1086/734279
- Bryan, C. J., Tipton, E., & Yeager, D. S. (2021). Behavioural science is unlikely to change the world without a heterogeneity revolution. Nature Human Behaviour, 5(8), 980–989. https://doi.org/10.1038/s41562-021-01143-3
- Brysbaert, M., & Stevens, M. (2018). Power Analysis and Effect Size in Mixed Effects Models: A Tutorial. Journal of Cognition, 1(1). https://doi.org/10.5334/joc.10
- Buchanan, E. M., Elsherif, M. M., Geller, J., Aberson, C., Gurkan, N., Ambrosini, E., Heyman, T., Montefinese, M., Vanpaemel, W., Barzykowski, K., Batres, C., Fellnhofer, K., Huang, G., McFall, J. P., Ribeiro, G., Röer, J. P., Ulloa Fulgeri, J. L., Roettger, T. B., Valentine, K. D., … Lewis, S. C. (2023). Accuracy in Parameter Estimation and Simulation Approaches for Sample Size Planning with Multiple Stimuli. https://doi.org/10.31219/osf.io/e3afx
- Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., & Munafò, M. R. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365–376. https://doi.org/10.1038/nrn3475
- Caldwell, A. R., Lakens, D., Parlett‑Pelleriti, C. M., Prochilo, G., & Aust, F. (2022). Power analysis with Superpower. https://aaroncaldwell.us/SuperpowerBook/
- DeBruine, L. M., & Barr, D. J. (2021). Understanding Mixed-Effects Models Through Data Simulation. Advances in Methods and Practices in Psychological Science, 4(1). https://doi.org/10.1177/2515245920965119
- Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole, C., Goodman, S. N., & Altman, D. G. (2016). Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. European Journal of Epidemiology, 31(4), 337–350. https://doi.org/10.1007/s10654-016-0149-3
- Hallgren, K. A. (2013). Conducting Simulation Studies in the R Programming Environment. Tutorials in Quantitative Methods for Psychology, 9(2), 43–60. https://doi.org/10.20982/tqmp.09.2.p043
- Holzmeister, F., Johannesson, M., Böhm, R., Dreber, A., Huber, J., & Kirchler, M. (2024). Heterogeneity in effect size estimates. Proceedings of the National Academy of Sciences, 121(32). https://doi.org/10.1073/pnas.2403490121
- Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Frontiers in Psychology, 4. https://doi.org/10.3389/fpsyg.2013.00863
- Lakens, D. (2022). Sample Size Justification. Collabra: Psychology, 8(1). https://doi.org/10.1525/collabra.33267
- Lengersdorff, L. L., & Lamm, C. (2025). With Low Power Comes Low Credibility? Toward a Principled Critique of Results From Underpowered Tests. Advances in Methods and Practices in Psychological Science, 8(1). https://doi.org/10.1177/25152459241296397
- Pek, J., & Flora, D. B. (2018). Reporting effect sizes in original psychological research: A discussion and tutorial. Psychological Methods, 23(2), 208–225. https://doi.org/10.1037/met0000126
- Perugini, M., Gallucci, M., & Costantini, G. (2014). Safeguard Power as a Protection Against Imprecise Power Estimates. Perspectives on Psychological Science, 9(3), 319–332. https://doi.org/10.1177/1745691614528519
- Wegener, D. T., Fabrigar, L. R., Pek, J., & Hoisington-Shaw, K. (2021). Evaluating Research in Personality and Social Psychology: Considerations of Statistical Power and Concerns About False Findings. Personality and Social Psychology Bulletin, 48(7), 1105–1117. https://doi.org/10.1177/01461672211030811
- Wilson, B. M., & Wixted, J. T. (2023). On the importance of modeling the invisible world of underlying effect sizes. Social Psychological Bulletin, 18. https://doi.org/10.32872/spb.9981
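The simulation approach emphasized above (e.g., DeBruine & Barr, 2021; Hallgren, 2013) can be sketched in a few lines: repeatedly generate data under an assumed effect size and count how often the test rejects. Below is a minimal illustration for a two-sample t-test; the function name and the assumed standardized effect of d = 0.5 are ours, and the numbers are purely illustrative:

```python
import numpy as np
from scipy import stats

def simulated_power(n_per_group, effect_size=0.5, alpha=0.05,
                    n_sims=5000, seed=1):
    """Estimate power for a two-sample t-test by simulating experiments."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        a = rng.normal(0.0, 1.0, n_per_group)          # control group
        b = rng.normal(effect_size, 1.0, n_per_group)  # treatment, shifted by d
        _, p = stats.ttest_ind(a, b)
        rejections += p < alpha
    return rejections / n_sims
```

For d = 0.5 and 64 participants per group, the estimate lands near the conventional 0.80; increasing `n_sims` tightens the Monte Carlo error, and swapping in a different data-generating process (e.g., mixed-effects structures) is what the DeBruine and Barr tutorial demonstrates.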
Exploratory and confirmatory analyses
Confirmatory analyses test a priori hypotheses against a pre-specified analysis plan (ideally preregistered or a Registered Report), with any deviations documented. Exploratory analyses probe patterns, generate hypotheses, and build models after seeing the data.
- Chambers, C. (2017). The Seven Deadly Sins of Psychology. https://doi.org/10.1515/9781400884940
- Feest, U., & Devezer, B. (2025). Toward a more accurate notion of exploratory research (and why it matters). PhilSci Archive. https://philsci-archive.pitt.edu/24482/
- Jacobucci, R. (2022). A critique of using the labels confirmatory and exploratory in modern psychological research. Frontiers in Psychology, 13. https://doi.org/10.3389/fpsyg.2022.1020770
- Lin, W., & Green, D. P. (2016). Standard Operating Procedures: A Safety Net for Pre-Analysis Plans. Political Science and Politics, 49(3), 495–500. https://doi.org/10.1017/S1049096516000810
- Rubin, M., & Donkin, C. (2022). Exploratory hypothesis tests can be more compelling than confirmatory hypothesis tests. Philosophical Psychology, 37(8), 2019–2047. https://doi.org/10.1080/09515089.2022.2113771
- Szollosi, A., & Donkin, C. (2021). Arrested Theory Development: The Misguided Distinction Between Exploratory and Confirmatory Research. Perspectives on Psychological Science, 16(4), 717–724. https://doi.org/10.1177/1745691620966796
- Wagenmakers, E.-J., Dutilh, G., & Sarafoglou, A. (2018). The Creativity-Verification Cycle in Psychological Science: New Methods to Combat Old Idols. Perspectives on Psychological Science, 13(4), 418–427. https://doi.org/10.1177/1745691618771357
- Wagenmakers, E.-J., Wetzels, R., Borsboom, D., van der Maas, H. L. J., & Kievit, R. A. (2012). An Agenda for Purely Confirmatory Research. Perspectives on Psychological Science, 7(6), 632–638. https://doi.org/10.1177/1745691612463078
Limitations and benefits of NHST, Bayesian & Likelihood approaches.
Beyond frequentist statistics, there are other quantitative approaches, each with different assumptions and goals. This sub-cluster summarizes the benefits and limitations of each.
- Abadie, A. (2020). Statistical Nonsignificance in Empirical Economics. American Economic Review: Insights, 2(2), 193–208. https://doi.org/10.1257/aeri.20190252
- Cumming, G. (2014). The New Statistics: Why and How. Psychological Science, 25(1), 7–29. https://doi.org/10.1177/0956797613504966
- Etz, A., Gronau, Q. F., Dablander, F., Edelsbrunner, P. A., & Baribault, B. (2017). How to become a Bayesian in eight easy steps: An annotated reading list. Psychonomic Bulletin & Review, 25(1), 219–234. https://doi.org/10.3758/s13423-017-1317-5
- Gelman, A., & Higgs, M. (2025). Interrogating the “cargo cult science” metaphor. Theory and Society, 54(2), 197–207. https://doi.org/10.1007/s11186-025-09614-6
- Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole, C., Goodman, S. N., & Altman, D. G. (2016). Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. European Journal of Epidemiology, 31(4), 337–350. https://doi.org/10.1007/s10654-016-0149-3
- Harmon-Jones, E., Harmon-Jones, C., Amodio, D. M., Gable, P. A., & Schmeichel, B. J. (2025). Valid replications require valid methods: Recommendations for best methodological practices with lab experiments. Motivation Science, 11(3), 235–245. https://doi.org/10.1037/mot0000398
- Kass, R. E., & Raftery, A. E. (1995). Bayes Factors. Journal of the American Statistical Association, 90(430), 773–795. https://doi.org/10.2307/2291091
- Keysers, C., Gazzola, V., & Wagenmakers, E.-J. (2020). Using Bayes factor hypothesis testing in neuroscience to establish evidence of absence. Nature Neuroscience, 23(7), 788–799. https://doi.org/10.1038/s41593-020-0660-4
- Lengersdorff, L. L., & Lamm, C. (2025). With Low Power Comes Low Credibility? Toward a Principled Critique of Results From Underpowered Tests. Advances in Methods and Practices in Psychological Science, 8(1). https://doi.org/10.1177/25152459241296397
- Nuzzo, R. (2014). Scientific method: Statistical errors. Nature, 506(7487), 150–152. https://doi.org/10.1038/506150a
- Wagenmakers, E.-J., Dutilh, G., & Sarafoglou, A. (2018). The Creativity-Verification Cycle in Psychological Science: New Methods to Combat Old Idols. Perspectives on Psychological Science, 13(4), 418–427. https://doi.org/10.1177/1745691618771357
Philosophy of science
Approaches to assessing the reliability of scientific theories, reasoning, and methods, and their ability to make predictions about the natural and social world. Introduces how differing philosophies (positivist, post-positivist, constructivist, etc.) influence what scientists consider valid evidence and how open science challenges some traditional norms.
- Bennett, E. A. (2021). Open Science From a Qualitative, Feminist Perspective: Epistemological Dogmas and a Call for Critical Examination. Psychology of Women Quarterly, 45(4), 448–456. https://doi.org/10.1177/03616843211036460
- Class, B., de Bruyne, M., Wuillemin, C., Donzé, D., & Claivaz, J.-B. (2021). Towards Open Science for the Qualitative Researcher: From a Positivist to an Open Interpretation. International Journal of Qualitative Methods, 20. https://doi.org/10.1177/16094069211034641
- Feest, U., & Devezer, B. (2025). Toward a more accurate notion of exploratory research (and why it matters). PhilSci Archive. https://philsci-archive.pitt.edu/24482/
- Field, S. M., & Pownall, M. (2025). Subjectivity is a Feature, not a Flaw: A Call to Unsilence the Human Element in Science. https://doi.org/10.31219/osf.io/ga5fb_v1
- Guest, O., & Martin, A. E. (2021). How Computational Modeling Can Force Theory Building in Psychological Science. Perspectives on Psychological Science, 16(4), 789–802. https://doi.org/10.1177/1745691620970585
- Guest, O. (2024). What Makes a Good Theory, and How Do We Make a Theory Good? Computational Brain & Behavior, 7(4), 508–522. https://doi.org/10.1007/s42113-023-00193-2
- Hutmacher, F., & Franz, D. J. (2025). Approaching psychology’s current crises by exploring the vagueness of psychological concepts: Recommendations for advancing the discipline. American Psychologist, 80(2), 220–231. https://doi.org/10.1037/amp0001300
- Jamieson, R. K., & Pexman, P. M. (2020). Moving beyond 20 questions: We (still) need stronger psychological theory. Canadian Psychology / Psychologie Canadienne, 61(4), 273–280. https://doi.org/10.1037/cap0000223
- Lakatos, I. (1978). The Methodology of Scientific Research Programmes. https://doi.org/10.1017/CBO9780511621123
- Leonelli, S. (2022). Open Science and Epistemic Diversity: Friends or Foes? Philosophy of Science, 89(5), 991–1001. https://doi.org/10.1017/psa.2022.45
- Leonelli, S. (2023). Philosophy of open science. Cambridge University Press. http://philsci-archive.pitt.edu/id/eprint/21986
- Mackenzie, N., & Knipe, S. (2006). Research dilemmas: Paradigms, methods and methodology. Issues in Educational Research, 16(2), 193–205. http://www.iier.org.au/iier16/mackenzie.html
- Malich, L., & Rehmann-Sutter, C. (2022). Metascience Is Not Enough – A Plea for Psychological Humanities in the Wake of the Replication Crisis. Review of General Psychology, 26(2), 261–273. https://doi.org/10.1177/10892680221083876
- Mayrhofer, R., Büchner, I. C., & Hevesi, J. (2024). The quantitative paradigm and the nature of the human mind. The replication crisis as an epistemological crisis of quantitative psychology in view of the ontic nature of the psyche. Frontiers in Psychology, 15. https://doi.org/10.3389/fpsyg.2024.1390233
- Meehl, P. E. (1967). Theory-Testing in Psychology and Physics: A Methodological Paradox. Philosophy of Science, 34(2), 103–115. https://doi.org/10.1086/288135
- Pownall, M. (2024). Is replication possible in qualitative research? A response to Makel et al. (2022). Educational Research and Evaluation, 29(1–2), 104–110. https://doi.org/10.1080/13803611.2024.2314526
- Press, C., Yon, D., & Heyes, C. (2022). Building better theories. Current Biology, 32(1), R13–R17. https://doi.org/10.1016/j.cub.2021.11.027
- Steltenpohl, C. N., Lustick, H., Meyer, M. S., Lee, L. E., Stegenga, S. M., Standiford Reyes, L., & Renbarger, R. L. (2023). Rethinking Transparency and Rigor from a Qualitative Open Science Perspective. Journal of Trial and Error, 4(1), 47–59. https://doi.org/10.36850/mr7
- van Rooij, I. (2022). Psychological models and their distractors. Nature Reviews Psychology, 1(3), 127–128. https://doi.org/10.1038/s44159-022-00031-5
- Yarkoni, T. (2020). The generalizability crisis. Behavioral and Brain Sciences, 45. https://doi.org/10.1017/S0140525X20001685
Questionable measurement practices (QMPs), validity & reliability issues.
The quality of our measures shapes the validity of our results and offers another avenue for addressing potential questionable practices. This sub-cluster examines how measurement choices affect the credibility of findings, addressing Questionable Measurement Practices (QMPs) such as ad-hoc scale trimming, unvalidated instruments, poor reliability reporting, and ignored measurement invariance, and their impact on construct validity, reliability, and generalizability.
- Flake, J. K., & Fried, E. I. (2020). Measurement Schmeasurement: Questionable Measurement Practices and How to Avoid Them. Advances in Methods and Practices in Psychological Science, 3(4), 456–465. https://doi.org/10.1177/2515245920952393
- Flake, J. K., Pek, J., & Hehman, E. (2017). Construct Validation in Social and Personality Research. Social Psychological and Personality Science, 8(4), 370–378. https://doi.org/10.1177/1948550617693063
- Heyman, T., Pronizius, E., Lewis, S. C., Acar, O. A., Adamkovič, M., Ambrosini, E., Antfolk, J., Barzykowski, K., Baskin, E., Batres, C., Boucher, L., Boudesseul, J., Brandstätter, E., Collins, W. M., Filipović Ðurđević, D., Egan, C., Era, V., Ferreira, P., Fini, C., … Buchanan, E. M. (2025). Crowdsourcing multiverse analyses to explore the impact of different data-processing and analysis decisions: A tutorial. Psychological Methods. https://doi.org/10.1037/met0000770
- Hussey, I., & Hughes, S. (2020). Hidden Invalidity Among 15 Commonly Used Measures in Social and Personality Psychology. Advances in Methods and Practices in Psychological Science, 3(2), 166–184. https://doi.org/10.1177/2515245919882903
- Hutmacher, F., & Franz, D. J. (2025). Approaching psychology’s current crises by exploring the vagueness of psychological concepts: Recommendations for advancing the discipline. American Psychologist, 80(2), 220–231. https://doi.org/10.1037/amp0001300
- Parsons, S. (2022). Exploring reliability heterogeneity with multiverse analyses: Data processing decisions unpredictably influence measurement reliability. Meta-Psychology, 6. https://doi.org/10.15626/MP.2020.2577
- Parsons, S., Kruijt, A.-W., & Fox, E. (2019). Psychological Science Needs a Standard Practice of Reporting the Reliability of Cognitive-Behavioral Measurements. Advances in Methods and Practices in Psychological Science, 2(4), 378–395. https://doi.org/10.1177/2515245919879695
- Rodebaugh, T. L., Scullin, R. B., Langer, J. K., Dixon, D. J., Huppert, J. D., Bernstein, A., Zvielli, A., & Lenze, E. J. (2016). Unreliability as a threat to understanding psychopathology: The cautionary tale of attentional bias. Journal of Abnormal Psychology, 125(6), 840–851. https://doi.org/10.1037/abn0000184
Research design, sampling methods, & their implications for inferences.
How design and sampling choices shape bias, precision, and generalizability, and thereby determine the credibility and scope of statistical inference. Covers threats to internal and external validity, selection bias, clustering/design effects, weighting, and transparent reporting and preregistration. This sub-cluster emphasizes adequate power and sample-size planning (e.g., safeguard power), transparent pre-analysis planning to constrain researcher degrees of freedom, and rigorous, valid methods as prerequisites for meaningful replication across lab and field work.
- Gervais, W. M., Jewell, J. A., Najle, M. B., & Ng, B. K. L. (2015). A Powerful Nudge? Presenting Calculable Consequences of Underpowered Research Shifts Incentives Toward Adequately Powered Designs. Social Psychological and Personality Science, 6(7), 847–854. https://doi.org/10.1177/1948550615584199
- Harmon-Jones, E., Harmon-Jones, C., Amodio, D. M., Gable, P. A., & Schmeichel, B. J. (2025). Valid replications require valid methods: Recommendations for best methodological practices with lab experiments. Motivation Science, 11(3), 235–245. https://doi.org/10.1037/mot0000398
- Perugini, M., Gallucci, M., & Costantini, G. (2014). Safeguard Power as a Protection Against Imprecise Power Estimates. Perspectives on Psychological Science, 9(3), 319–332. https://doi.org/10.1177/1745691614528519
- Wicherts, J. M., Veldkamp, C. L. S., Augusteijn, H. E. M., Bakker, M., van Aert, R. C. M., & van Assen, M. A. L. M. (2016). Degrees of Freedom in Planning, Running, Analyzing, and Reporting Psychological Studies: A Checklist to Avoid p-Hacking. Frontiers in Psychology, 7. https://doi.org/10.3389/fpsyg.2016.01832
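Perugini et al.'s (2014) safeguard-power idea mentioned above — plan the sample size around the lower bound of a confidence interval on a pilot effect size, rather than its point estimate — can be sketched as follows. This is a rough sketch using a standard approximation to the standard error of Cohen's d; the 60% two-sided level mirrors the paper's illustration, and the function name is ours:

```python
import math
from scipy import stats

def safeguard_effect_size(d, n1, n2, confidence=0.60):
    """Lower bound of a two-sided CI around an observed Cohen's d,
    used as a conservative planning value instead of the point estimate."""
    # approximate standard error of d for two independent groups
    se = math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    z = stats.norm.ppf(1 - (1 - confidence) / 2)
    return d - z * se
```

A pilot d of 0.5 with 30 participants per group yields a safeguard value of roughly 0.28, so the main study would be powered for that smaller effect; with larger pilot samples the safeguard value converges toward the observed d.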
The logic of null hypothesis testing, p-values, Type I and II errors (and when and why they might happen).
Frequentist statistics are typically the default in quantitative research, but they come with certain assumptions and implications and are often misinterpreted. This sub-cluster clarifies the logic of NHST, the meaning of p-values, and when and why Type I and Type II errors arise, extending to Type S (sign) and Type M (magnitude) errors. It links these to design and power choices and outlines practical steps for better inference and reporting.
- Banerjee, A., Chitnis, U., Jadhav, S., Bhawalkar, J., & Chaudhury, S. (2009). Hypothesis testing, type I and type II errors. Industrial Psychiatry Journal, 18(2), 127. https://doi.org/10.4103%2F0972-6748.62274
- Bird, A. (2021). Understanding the Replication Crisis as a Base Rate Fallacy. The British Journal for the Philosophy of Science, 72(4), 965–993. https://doi.org/10.1093/bjps/axy051
- Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49(12), 997–1003. https://doi.org/10.1037/0003-066X.49.12.997
- Gelman, A., & Carlin, J. (2014). Beyond Power Calculations. Perspectives on Psychological Science, 9(6), 641–651. https://doi.org/10.1177/1745691614551642
- Gelman, A., & Higgs, M. (2025). Interrogating the “cargo cult science” metaphor. Theory and Society, 54(2), 197–207. https://doi.org/10.1007/s11186-025-09614-6
- Gigerenzer, G., Krauss, S., & Vitouch, O. (2004). The Null Ritual: What You Always Wanted to Know About Significance Testing but Were Afraid to Ask. The SAGE Handbook of Quantitative Methodology for the Social Sciences, 392–409. https://doi.org/10.4135/9781412986311.n21
- Gigerenzer, G. (2004). Mindless statistics. The Journal of Socio-Economics, 33(5), 587–606. https://doi.org/10.1016/j.socec.2004.09.033
- Lakens, D. Improving your statistical inferences [Online course]. https://www.coursera.org/learn/statistical-inferences
- Lengersdorff, L. L., & Lamm, C. (2025). With Low Power Comes Low Credibility? Toward a Principled Critique of Results From Underpowered Tests. Advances in Methods and Practices in Psychological Science, 8(1). https://doi.org/10.1177/25152459241296397
- Rubin, M. (2024). Inconsistent multiple testing corrections: The fallacy of using family-based error rates to make inferences about individual hypotheses. Methods in Psychology, 10, 100140. https://doi.org/10.1016/j.metip.2024.100140
- Rubin, M. (2024). Type I Error Rates are Not Usually Inflated. Journal of Trial and Error, 4(2). https://doi.org/10.36850/4d35-44bd
- Wasserstein, R. L., & Lazar, N. A. (2016). The ASA Statement on p -Values: Context, Process, and Purpose. The American Statistician, 70(2), 129–133. https://doi.org/10.1080/00031305.2016.1154108
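The Type S (wrong sign) and Type M (exaggerated magnitude) errors discussed by Gelman and Carlin (2014) are easy to demonstrate by simulation: under low power, the estimates that happen to reach significance systematically overstate the true effect. A minimal sketch, with our own function name and illustrative numbers:

```python
import numpy as np
from scipy import stats

def error_rates(n, true_effect, alpha=0.05, n_sims=5000, seed=2):
    """Simulate two-group experiments with a known true effect and tally
    Type II errors plus Type S/M errors among significant results."""
    rng = np.random.default_rng(seed)
    sig_effects = []
    misses = 0
    for _ in range(n_sims):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(true_effect, 1.0, n)
        diff = b.mean() - a.mean()
        _, p = stats.ttest_ind(a, b)
        if p < alpha:
            sig_effects.append(diff)
        else:
            misses += 1
    sig = np.array(sig_effects)
    return {
        "type_II": misses / n_sims,                       # failed to detect a real effect
        "type_S": float(np.mean(sig < 0)),                # significant, but wrong sign
        "type_M": float(np.mean(np.abs(sig)) / true_effect),  # exaggeration ratio
    }
```

With a small true effect and 10 participants per group, most studies miss the effect, and the significant estimates overstate it severalfold, which is the core of the "winner's curse" argument against relying on underpowered significant findings.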