2 Conceptual and Statistical Knowledge

7 sub-clusters · 82 references

Attaining a grounding in fundamental statistics and measurement and their implications, encompassing conceptual knowledge, application, interpretation, and communication of statistical analyses. There are 7 sub-clusters, which further parse the learning and teaching process:

Effect sizes, statistical power, simulations, & confidence intervals. 19 / 19

Statistics offer more than p-values: we need other benchmarks to determine the statistical and practical relevance of an effect. Emphasizes effect sizes, confidence intervals, power analysis, and simulations to design adequately powered studies and to communicate practical significance.
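To make the simulation approach concrete, here is a minimal sketch in base R (the language the tutorials in this sub-cluster use): power is estimated by simulating many datasets under an assumed true effect and counting how often the test is significant. The effect size (d = 0.4), group sizes, and alpha below are illustrative assumptions, not values from any cited paper.

```r
set.seed(1)

sim_power <- function(n, d, alpha = 0.05, nsim = 5000) {
  pvals <- replicate(nsim, {
    x <- rnorm(n)                 # control group, true mean 0
    y <- rnorm(n, mean = d)       # treatment group, true effect d (in SD units)
    t.test(x, y)$p.value
  })
  mean(pvals < alpha)             # share of significant results = estimated power
}

sim_power(n = 50, d = 0.4)          # roughly 0.50: underpowered for d = 0.4
sim_power(n = 100, d = 0.4)         # roughly 0.80: the conventional benchmark
power.t.test(n = 100, delta = 0.4)  # analytic cross-check in base R
```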

evidence Editorial
Sample size estimation revisited
This editorial examines the prevalence and reproducibility of sample size estimations within the Journal of Sports Sciences, revealing that only a small minority of studies report enough detail to reproduce the calculations. It highlights a critical gap between established editorial guidelines and the actual reporting practices of researchers in the field.
practice/tools Editorial
Power, precision, and sample size estimation in sport and exercise science research
This resource offers a technical guide for sport and exercise scientists on determining sample sizes using both frequentist power and precision-based approaches. It acts as a practical primer for researchers to justify their study designs and align with rigorous statistical standards during the submission process.
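As a rough illustration of the precision-based route this primer describes, the sketch below (a large-sample approximation with assumed values) picks the per-group n that makes a 95% confidence interval for a mean difference no wider than a target half-width.

```r
n_per_group <- function(sd, half_width, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)           # large-sample critical value
  ceiling(2 * (z * sd / half_width)^2)     # per-group n for a two-group design
}
n_per_group(sd = 1, half_width = 0.3)      # ~86 per group for a CI of +/- 0.3 SD
```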
evidence Paper
Quantitative Political Science Research Is Greatly Underpowered
This large-scale meta-research study provides empirical evidence that quantitative political science research is severely underpowered, with a median power of only 10% across thousands of tests. The findings demonstrate that only a small fraction of tests in the discipline meet the standard 80% power threshold required to detect consensus effects.
advocacy Paper
Behavioural science is unlikely to change the world without a heterogeneity revolution
This article argues that the impact of behavioral science on real-world problems is hindered by a neglect of treatment effect heterogeneity. It advocates for a shift in research priorities toward understanding how and why effects vary across different contexts and populations, proposing a framework to improve the generalizability of findings.
practice/tools Paper
Power Analysis and Effect Size in Mixed Effects Models: A Tutorial
This tutorial addresses the difficulty of conducting power analysis for experimental designs that include both participant and stimulus samples, common in cognitive psychology. It provides researchers with practical methods and literature reviews to accurately estimate power and effect sizes when using mixed-effects models.
practice/tools Preprint
Accuracy in Parameter Estimation and Simulation Approaches for Sample Size Planning with Multiple Stimuli
This resource introduces simulation-based approaches and Accuracy in Parameter Estimation (AIPE) as alternatives for sample size planning in research studies with multiple stimuli. It provides tools to determine necessary sample sizes for precise parameter estimation when traditional power formulas are insufficient or inapplicable to complex designs.
evidence Paper
Power failure: why small sample size undermines the reliability of neuroscience
This study presents a meta-research analysis quantifying the prevalence of low statistical power across the neuroscience literature and its role in undermining reproducibility. It demonstrates how underpowered studies lead to inflated effect sizes and waste resources, calling for systemic changes in how neuroscience research is conducted and reported.
Caldwell, A. R., Lakens, D., Parlett‑Pelleriti, C. M., Prochilo, G., & Aust, F. (2022). Power analysis with Superpower. https://aaroncaldwell.us/SuperpowerBook/
practice/tools Paper
Understanding Mixed-Effects Models Through Data Simulation
This tutorial provides a practical guide to using data simulation to better understand and interpret linear mixed-effects models that include random effects for both subjects and stimuli. By walking through R code and parameter interpretation, it helps researchers build intuition for complex models and correctly apply them to their own experimental data.
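A minimal sketch of the kind of simulation this tutorial walks through, assuming the lme4 package: generate data with crossed random intercepts for subjects and stimuli, then check that the fitted model recovers the (made-up) generating parameters.

```r
library(lme4)
set.seed(2)

n_subj <- 40; n_item <- 20
d <- expand.grid(subj = 1:n_subj, item = 1:n_item)
d$cond <- ifelse(d$item %% 2 == 0, 0.5, -0.5)   # deviation-coded, varies by item

b0 <- 400; b1 <- 30                      # assumed fixed intercept and effect (ms)
u_subj <- rnorm(n_subj, sd = 50)         # by-subject random intercepts
u_item <- rnorm(n_item, sd = 20)         # by-item random intercepts
d$rt <- b0 + b1 * d$cond + u_subj[d$subj] + u_item[d$item] +
        rnorm(nrow(d), sd = 80)          # residual noise

m <- lmer(rt ~ cond + (1 | subj) + (1 | item), data = d)
summary(m)   # fixed and random effects should sit near the generating values
```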
overview Paper
Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations
This resource clarifies common misconceptions regarding frequentist statistical indicators by identifying and correcting twenty prevalent misinterpretations of p-values, confidence intervals, and power. It serves as a rigorous pedagogical guide to help researchers avoid incorrect shortcut definitions that lead to invalid scientific conclusions.
practice/tools Paper
Conducting Simulation Studies in the R Programming Environment
This resource provides a practical tutorial for using the R programming environment to conduct simulation studies, making these techniques accessible to researchers without advanced programming backgrounds. It includes annotated code to help users estimate statistical power and assess the appropriateness of various analytical methods for their specific research questions.
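In the spirit of that tutorial, here is a small self-contained simulation study with made-up settings: both groups are drawn from the same skewed null distribution, so the empirical rejection rates of the t-test and the Wilcoxon test estimate their Type I error under skew.

```r
set.seed(3)
res <- replicate(5000, {
  x <- rlnorm(20); y <- rlnorm(20)          # same skewed null distribution
  c(t = t.test(x, y)$p.value,
    w = wilcox.test(x, y)$p.value)
})
rowMeans(res < 0.05)   # empirical Type I error rates; both should be near 0.05
```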
overview Paper
Heterogeneity in effect size estimates
This resource proposes a framework that decomposes heterogeneity in effect sizes into three distinct sources: population, design, and analytical variation. It provides a theoretical foundation for understanding how these different forms of uncertainty limit the generalizability of research findings and affect the cumulative probability that a tested hypothesis is true.
practice/tools Paper
Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs
This primer provides step-by-step instructions for calculating and reporting various effect size measures for t-tests and ANOVAs, specifically distinguishing between metrics like Cohen’s d and partial eta-squared. It emphasizes how transparent effect size reporting directly enables cumulative science by facilitating accurate a priori power analyses and the inclusion of findings in meta-analyses.
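For instance, the pooled-standard-deviation version of Cohen's d that such primers cover can be computed by hand; the data below are invented for illustration.

```r
x <- c(5.1, 6.0, 5.5, 6.2, 5.8, 6.4)   # invented data, group 1
y <- c(4.8, 5.2, 4.9, 5.6, 5.0, 5.3)   # invented data, group 2
n1 <- length(x); n2 <- length(y)
sd_pooled <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
d <- (mean(x) - mean(y)) / sd_pooled   # Cohen's d, pooled-SD form
d   # report alongside the t-test so meta-analysts can reuse it
```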
overview Paper
Sample Size Justification
This article details six different approaches for justifying sample sizes in quantitative research, moving beyond simple power analysis to include strategies based on accuracy planning, resource constraints, and point estimates. It provides researchers with a structured decision-making framework and standardized vocabulary to transparently communicate the rationale for their data collection plans.
evidence Paper
With Low Power Comes Low Credibility? Toward a Principled Critique of Results From Underpowered Tests
This paper develops a principled framework for critiquing results from underpowered tests, examining when low statistical power genuinely undermines the credibility of a reported finding. It cautions against blanket dismissals of underpowered studies and spells out the conditions under which power-based critiques are warranted.
practice/tools Paper
Reporting effect sizes in original psychological research: A discussion and tutorial.
This tutorial offers practical guidance for psychological researchers on the reporting and interpretation of effect sizes and their associated confidence intervals. It specifically emphasizes the importance of unstandardized effect sizes and provides recommendations for selecting measures that best address specific research questions.
practice/tools Paper
Safeguard Power as a Protection Against Imprecise Power Estimates
This article introduces 'safeguard power analysis,' a practical method for sample size planning that accounts for the inherent uncertainty in effect size estimates. By using the lower bound of a confidence interval around an effect size, the tool helps researchers avoid the common problem of designing underpowered studies based on potentially inflated initial results.
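A sketch of the safeguard idea with assumed pilot values: plan the new study on a lower confidence bound for the pilot effect size (the paper suggests, for example, the lower limit of a 60% interval) rather than on the point estimate. The normal approximation to the standard error of d used here is a simplification.

```r
d_pilot <- 0.5; n1 <- 30; n2 <- 30       # assumed pilot result
se_d <- sqrt((n1 + n2) / (n1 * n2) +
             d_pilot^2 / (2 * (n1 + n2)))          # approximate SE of d
d_safeguard <- d_pilot - qnorm(0.8) * se_d         # lower limit of a 60% CI
power.t.test(delta = d_safeguard, power = 0.8)     # ~200 per group, vs ~64
                                                   # if we trusted d = 0.5
```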
critique Paper
Evaluating Research in Personality and Social Psychology: Considerations of Statistical Power and Concerns About False Findings
This article critiques the application of False Finding Rate (FFR) calculations as a primary criterion for evaluating the quality of research in personality and social psychology. It argues that the assumptions underlying these power-based evaluations often fail to reflect the practical realities of the discipline, potentially leading to the unfair dismissal of valid research.
evidence Paper
On the importance of modeling the invisible world of underlying effect sizes
This resource uses formal modeling and simulations to demonstrate that headline replication rates cannot be meaningfully interpreted without considering the underlying distribution of true effect sizes and statistical power. It provides a meta-research framework for understanding how observed replication failures can emerge from the mathematical properties of original study designs rather than solely from questionable research practices.
Exploratory and confirmatory analyses 8 / 8

Confirmatory analyses test a priori hypotheses against a pre-specified analysis plan (ideally preregistered/Registered Report); any deviations are documented. Exploratory analyses probe patterns, generate hypotheses, and build models after seeing the data.

advocacy Book
The Seven Deadly Sins of Psychology
This work identifies and analyzes systemic flaws in psychological science, such as publication bias and lack of transparency, that contribute to the replication crisis. It makes a strong case for institutional reform and the adoption of open science practices, such as Registered Reports, to improve the reliability of the field.
Feest, U., & Devezer, B. (2025). Toward a more accurate notion of exploratory research (and why it matters). PhilSci Archive. https://philsci-archive.pitt.edu/24482/
critique Paper
A critique of using the labels confirmatory and exploratory in modern psychological research
This paper critiques the binary categorization of research as either exploratory or confirmatory, arguing that these labels are too simplistic for modern psychological research involving complex statistical models. It highlights how these terms can mask the nuanced relationship between theory and data analysis, potentially obstructing methodological progress.
Lin, W., & Green, D. P. (2016). Standard Operating Procedures: A Safety Net for Pre-Analysis Plans. Political Science and Politics, 49(3), 495–500. https://doi.org/10.1017/S1049096516000810
critique Paper
Exploratory hypothesis tests can be more compelling than confirmatory hypothesis tests
This paper challenges the prevailing hierarchy that favors confirmatory testing over exploratory testing, arguing that the latter can often produce more compelling scientific insights. It provides a theoretical counterpoint to the idea that preregistration is the primary determinant of research quality or certainty.
critique Paper
Arrested Theory Development: The Misguided Distinction Between Exploratory and Confirmatory Research
This resource argues that psychology’s replicability crisis stems from "flexible theories" rather than a failure to distinguish between exploration and confirmation. It critiques current trends that prioritize methodological fixes like preregistration over the fundamental need for developing rigorous, "hard to vary" theories.
advocacy Paper
The Creativity-Verification Cycle in Psychological Science: New Methods to Combat Old Idols
This article advocates for the adoption of preregistration in psychological science as a necessary safeguard against pervasive cognitive biases like hindsight and confirmation bias. It argues for a clear structural separation between the 'creativity' of exploratory data analysis and the 'verification' of confirmatory hypothesis testing.
advocacy Paper
An Agenda for Purely Confirmatory Research
This resource advocates for the adoption of purely confirmatory research designs to prevent researchers from fine-tuning analyses to fit observed data. It highlights how the lack of pre-commitment to specific statistical tests undermines the validity of research claims in psychology.
Limitations and benefits of NHST, Bayesian & Likelihood approaches. 11 / 11

Alongside frequentist statistics, there are other quantitative approaches, each with different assumptions and goals. This sub-cluster summarizes the benefits and limitations of each.

advocacy Paper
The New Statistics
This publication advocates for the adoption of "the new statistics," urging researchers to move away from null-hypothesis significance testing in favor of estimation and effect size reporting. It presents a clear case for research integrity reforms, including the prespecification of studies and the active encouragement of replication to improve literature reliability.
practice/tools Paper
How to become a Bayesian in eight easy steps: An annotated reading list
This resource provides a curated and annotated reading list designed to guide researchers through the transition from frequentist to Bayesian statistical thinking. It offers a structured pathway for self-study by identifying foundational texts and explaining their significance in mastering Bayesian inference.
overview Paper
Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations
This resource clarifies common misconceptions regarding frequentist statistical indicators by identifying and correcting twenty prevalent misinterpretations of p-values, confidence intervals, and power. It serves as a rigorous pedagogical guide to help researchers avoid incorrect shortcut definitions that lead to invalid scientific conclusions.
overview Paper
Bayes Factors
This paper provides a comprehensive review of the Bayes factor as a tool for quantifying scientific evidence in favor of a hypothesis. It discusses the practical application of Bayesian hypothesis testing across various research contexts and provides guidelines for interpreting the strength of evidence.
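As a concrete illustration, the BayesFactor R package is one widely used implementation of default Bayes factor tests; the data below are simulated and the default Cauchy prior is assumed.

```r
library(BayesFactor)
set.seed(4)
x <- rnorm(40, mean = 0.3); y <- rnorm(40)
bf <- ttestBF(x = x, y = y)   # default Cauchy prior on the effect size
bf                            # BF10: evidence for a difference over the point null
1 / extractBF(bf)$bf          # BF01: evidence for the null over the alternative
```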
practice/tools Paper
Using Bayes factor hypothesis testing in neuroscience to establish evidence of absence
This resource demonstrates how Bayesian hypothesis testing can be applied specifically within neuroscience to distinguish between inconclusive results and genuine evidence for the absence of an effect. It provides a practical alternative to frequentist methods, which are inherently unable to provide statistical evidence in support of a null hypothesis.
overview Paper
Scientific method: Statistical errors
This resource provides an accessible overview of the common misinterpretations of p-values and how the rigid reliance on statistical significance thresholds fuels the reproducibility crisis. It explains the mathematical vulnerability of 'near-significant' results and suggests moving toward more nuanced statistical reporting that avoids binary thinking.
advocacy Paper
The Creativity-Verification Cycle in Psychological Science: New Methods to Combat Old Idols
This article advocates for the adoption of preregistration in psychological science as a necessary safeguard against pervasive cognitive biases like hindsight and confirmation bias. It argues for a clear structural separation between the 'creativity' of exploratory data analysis and the 'verification' of confirmatory hypothesis testing.
critique Paper
Statistical Nonsignificance in Empirical Economics
This article critiques the standard practice in empirical economics of prioritizing statistically significant rejections over non-significant findings. It demonstrates that in the context of large economic datasets, the failure to reject a point null is often more scientifically informative than a rejection, challenging the traditional hierarchy of evidence.
policies Paper
Interrogating the “cargo cult science” metaphor
This paper critically examines Feynman's "cargo cult science" metaphor and its frequent invocation in debates about research integrity and the replication crisis. It questions what the metaphor illuminates, and what it obscures, about how questionable research practices arise and how scientific communities should respond to them.
practice/tools Paper
Valid replications require valid methods: Recommendations for best methodological practices with lab experiments.
This resource provides actionable methodological recommendations for conducting lab experiments to ensure they serve as a solid foundation for valid replications. It highlights specific practices in experimental design and implementation that are essential for producing reliable and reproducible findings.
evidence Paper
With Low Power Comes Low Credibility? Toward a Principled Critique of Results From Underpowered Tests
This paper develops a principled framework for critiquing results from underpowered tests, examining when low statistical power genuinely undermines the credibility of a reported finding. It cautions against blanket dismissals of underpowered studies and spells out the conditions under which power-based critiques are warranted.
Philosophy of science 20 / 20

Approaches to assessing the reliability of scientific theories, reasoning, and methods, and their ability to make predictions about the natural and social world. Introduces how differing philosophies (positivist, post-positivist, constructivist, etc.) influence what scientists consider valid evidence, and how open science challenges some traditional norms.

critique Paper
Open Science From a Qualitative, Feminist Perspective: Epistemological Dogmas and a Call for Critical Examination
This article evaluates the alignment between open science frameworks and the priorities of qualitative and feminist research within the field of psychology. It specifically questions whether existing open science dogmas inadvertently marginalize transgressive research methods and calls for a critical examination of how these frameworks impact radical inquiry.
practice/tools Paper
Towards Open Science for the Qualitative Researcher: From a Positivist to an Open Interpretation
This resource provides a practical reflection on data handling and pseudonymization in qualitative research, detailed through a case study of custom software development. It bridges technical implementation with epistemological inquiry to demonstrate how open research data guidelines can be successfully adapted to qualitative workflows.
Feest, U., & Devezer, B. (2025). Toward a more accurate notion of exploratory research (and why it matters). PhilSci Archive. https://philsci-archive.pitt.edu/24482/
advocacy Preprint
Subjectivity is a Feature, not a Flaw: A Call to Unsilence the Human Element in Science
This resource advocates for the recognition of researcher subjectivity as an inherent and valuable component of science rather than a contaminant to be purged. It challenges the traditional myth of the detached scientist and encourages the explicit use of reflexivity to enhance scientific integrity.
advocacy Paper
How Computational Modeling Can Force Theory Building in Psychological Science
This article promotes the adoption of computational modeling as a vital tool for advancing theory building within psychological science. It demonstrates how formalizing theories into models forces researchers to clarify vague intuitions and specify assumptions that often remain unexamined in purely verbal theories.
overview Paper
What Makes a Good Theory, and How Do We Make a Theory Good?
This resource proposes a formal ontology of criteria, known as a metatheoretical calculus, to evaluate the quality and robustness of scientific theories. It specifically outlines categories such as metaphysical commitment and discursive survival to help researchers move beyond vague assessments and toward rigorous theoretical adjudication.
critique Paper
Approaching psychology’s current crises by exploring the vagueness of psychological concepts: Recommendations for advancing the discipline.
This resource argues that the replication, theory, and universality crises in psychology are fundamentally linked to the vagueness of psychological concepts. It suggests that advancing the discipline requires a focus on theoretical and philosophical refinement rather than just methodological or statistical changes.
advocacy Paper
Moving beyond 20 questions: We (still) need stronger psychological theory.
This resource argues that psychology continues to struggle with fragmented findings and emphasizes the persistent need for robust, unifying theories to replace the "20 questions" style of empirical research. It advocates for a shift in focus from isolated experimental effects toward the development of comprehensive theoretical frameworks.
critique Paper
Open Science and Epistemic Diversity: Friends or Foes?
This work explores how the current implementation of open science may marginalize diverse research traditions by privileging specific inquiry styles over others. It identifies four reference points—such as local specificity and data provenance—to help open science frameworks better accommodate epistemic diversity.
Leonelli, S. (2023). Philosophy of open science. Cambridge University Press. http://philsci-archive.pitt.edu/id/eprint/21986
Mackenzie, N., & Knipe, S. (2006). Research dilemmas: Paradigms, methods and methodology. Issues in Educational Research, 16(2), 193–205. http://www.iier.org.au/iier16/mackenzie.html
critique Paper
Metascience Is Not Enough – A Plea for Psychological Humanities in the Wake of the Replication Crisis
This article critiques the reliance on metascience as the primary solution to the replication crisis, arguing that it overlooks deep-seated epistemic problems within psychology. It advocates for integrating perspectives from the psychological humanities to address the conceptual and historical complexities that quantitative metascientific approaches may fail to capture.
critique Paper
The quantitative paradigm and the nature of the human mind. The replication crisis as an epistemological crisis of quantitative psychology in view of the ontic nature of the psyche
This paper frames the replication crisis in psychology as a fundamental epistemological mismatch between the complex nature of the human psyche and the quantitative methods used to measure it. It moves beyond statistical explanations to argue that the crisis stems from underlying philosophical and ontological assumptions that remain largely unaddressed in the field.
critique Paper
Theory-Testing in Psychology and Physics: A Methodological Paradox
This seminal paper identifies a methodological paradox where increased experimental precision in psychology, unlike in physics, actually makes theory corroboration more difficult when relying on null hypothesis significance testing. It critiques the logical foundations of how psychological theories are tested, arguing that 'statistical significance' is often an inadequate substitute for genuine theoretical progress.
critique Paper
Is replication possible in qualitative research? A response to Makel et al. (2022)
Serving as a direct rebuttal to advocacy pieces, this response highlights three core areas where the logic of replication conflicts with the goals of qualitative research. It provides a critical perspective on how open research practices developed for quantitative work may not be appropriate for educational or qualitative methodologies.
advocacy Paper
Building better theories
This resource argues that the replication crisis is fundamentally a crisis of theory, advocating for a shift in focus toward more rigorous theory construction and specification. It highlights how strengthening the theoretical foundations of psychological research is essential for creating more robust, falsifiable, and reproducible scientific findings.
critique Paper
Rethinking Transparency and Rigor from a Qualitative Open Science Perspective
This paper critiques the quantitative-centric definition of transparency in open science, arguing that current frameworks do not align with the epistemic goals of qualitative research. It proposes a broader perspective that emphasizes researcher reflexivity and contextual data interpretation as essential components of rigor.
critique Paper
Psychological models and their distractors
This paper critiques the current use of formal models in psychology, arguing that they often serve as 'distractors' that mask a lack of theoretical depth rather than resolving it. It challenges researchers to ensure that their mathematical models are genuinely grounded in coherent psychological theory rather than being used as mere technical window dressing.
Yarkoni, T. (2020). The generalizability crisis. Behavioral and Brain Sciences, 45. https://doi.org/10.1017/S0140525X20001685
Lakatos, I. (1978). The Methodology of Scientific Research Programmes. https://doi.org/10.1017/CBO9780511621123
Questionable measurement practices (QMPs), validity & reliability issues. 8 / 8

The quality of our measures affects the validity of our results and offers another avenue for addressing potential questionable practices. Examines how measurement choices shape the credibility of findings, addressing Questionable Measurement Practices (QMPs) such as ad-hoc scale trimming, unvalidated instruments, poor reliability reporting, and ignored measurement invariance, and their impact on construct validity, reliability, and generalizability.

practice/tools Paper
Measurement Schmeasurement: Questionable Measurement Practices and How to Avoid Them
This resource defines and categorizes Questionable Measurement Practices (QMPs), illustrating how hidden decisions in the measurement process can threaten the validity of scientific conclusions. It provides a practical framework for researchers to increase measurement transparency and offers guidance on how to avoid these common pitfalls during study design and reporting.
evidence Paper
Construct Validation in Social and Personality Research
The authors present empirical meta-research by auditing a representative sample of social and personality psychology papers to evaluate the state of construct validation. The study reveals a significant gap between the common use of latent variable measurement and the lack of rigorous, ongoing evidence provided by researchers to justify those measures.
critique Paper
Approaching psychology’s current crises by exploring the vagueness of psychological concepts: Recommendations for advancing the discipline.
This resource argues that the replication, theory, and universality crises in psychology are fundamentally linked to the vagueness of psychological concepts. It suggests that advancing the discipline requires a focus on theoretical and philosophical refinement rather than just methodological or statistical changes.
evidence Paper
Hidden Invalidity Among 15 Commonly Used Measures in Social and Personality Psychology
This study presents empirical evidence of 'hidden invalidity' by showing that widely used psychological scales often fail structural validity tests despite having acceptable internal consistency. By analyzing a uniquely large dataset, it demonstrates that standard metrics like Cronbach's alpha often mask significant psychometric flaws in social and personality psychology measures.
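To make the contrast concrete, Cronbach's alpha can be computed directly from its definition; as the toy example below (with invented, unidimensional data) shows, a high alpha is silent about the structural properties the study examines.

```r
cronbach_alpha <- function(items) {   # items: respondents x items numeric matrix
  k <- ncol(items)
  (k / (k - 1)) * (1 - sum(apply(items, 2, var)) / var(rowSums(items)))
}

set.seed(5)
latent <- rnorm(200)
items  <- sapply(1:5, function(i) latent + rnorm(200))  # five noisy indicators
cronbach_alpha(items)   # ~0.8: high consistency, yet silent on dimensionality
```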
Parsons, S. (2022). Exploring reliability heterogeneity with multiverse analyses: Data processing decisions unpredictably influence measurement reliability. Meta-Psychology, 6. https://doi.org/10.15626/MP.2020.2577
advocacy Paper
Psychological Science Needs a Standard Practice of Reporting the Reliability of Cognitive-Behavioral Measurements
This paper advocates for the establishment of a standard reporting practice for measurement reliability within cognitive-behavioral research to improve the robustness of psychological science. It argues that transparently reporting reliability is a necessary prerequisite for properly evaluating statistical inferences and ensuring that research findings are not merely artifacts of measurement error.
critique Paper
Unreliability as a threat to understanding psychopathology: The cautionary tale of attentional bias.
This resource critiques the reliance on unreliable behavioral measures in psychopathology research, using attentional bias as a primary example of how poor metrics hinder scientific progress. It highlights the specific threat that measurement error poses to major clinical initiatives, such as the RDoC, which depend on high-reliability measures for individual difference research and mediation analysis.
practice/tools Paper
Crowdsourcing multiverse analyses to explore the impact of different data-processing and analysis decisions: A tutorial.
This resource provides a practical tutorial on implementing multiverse analyses to test the robustness of research findings against various data-processing and analytical choices. It demonstrates how exploring multiple plausible analysis paths can reveal the sensitivity of results to arbitrary decisions, thereby improving the transparency and generalizability of empirical research.
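A toy sketch of the multiverse logic, with invented data and arbitrary processing choices: rerun the same test across every combination of defensible decisions and inspect how the result moves.

```r
set.seed(6)
d <- data.frame(rt = rlnorm(300, meanlog = 6, sdlog = 0.4),
                group = rep(c("a", "b"), 150))

specs <- expand.grid(cutoff = c(1000, 2000, 3000),   # outlier-trimming rules
                     log_rt = c(TRUE, FALSE))        # transform or not
specs$p <- mapply(function(cutoff, log_rt) {
  dd <- d[d$rt < cutoff, ]                 # one trimming rule per universe
  y  <- if (log_rt) log(dd$rt) else dd$rt  # one transform rule per universe
  t.test(y ~ dd$group)$p.value
}, specs$cutoff, specs$log_rt)
specs   # one row per universe: how stable is the conclusion across choices?
```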
Research design, sampling methods, & its implications for inferences. 4 / 4

How design choices and sampling strategies shape bias, precision, and generalizability. Includes threats to internal and external validity, selection bias, clustering/design effects, weighting, and transparent reporting and preregistration. Design and sampling decisions determine the credibility and scope of statistical inference, so this sub-cluster emphasizes adequate power and sample-size planning (e.g., safeguard power), transparent pre-analysis planning to constrain researcher degrees of freedom, and rigorous, valid methods as prerequisites for meaningful replication: reducing bias, increasing precision, and improving generalizability across lab and field work.

evidence Paper
A Powerful Nudge? Presenting Calculable Consequences of Underpowered Research Shifts Incentives Toward Adequately Powered Designs
This study uses a stylized hiring scenario to evaluate empirically how researchers weigh statistical power against individual productivity in hiring decisions. It demonstrates that explicitly presenting the scientific consequences of underpowered research can shift professional incentives toward adequately powered experimental designs.
practice/tools Paper
Valid replications require valid methods: Recommendations for best methodological practices with lab experiments.
This resource provides actionable methodological recommendations for conducting lab experiments to ensure they serve as a solid foundation for valid replications. It highlights specific practices in experimental design and implementation that are essential for producing reliable and reproducible findings.
practice/tools Paper
Safeguard Power as a Protection Against Imprecise Power Estimates
This article introduces 'safeguard power analysis,' a practical method for sample size planning that accounts for the inherent uncertainty in effect size estimates. By using the lower bound of a confidence interval around an effect size, the tool helps researchers avoid the common problem of designing underpowered studies based on potentially inflated initial results.
practice/tools Paper
Degrees of Freedom in Planning, Running, Analyzing, and Reporting Psychological Studies: A Checklist to Avoid p-Hacking
This resource provides an extensive checklist of 34 specific researcher degrees of freedom that can lead to p-hacking across various stages of the research process. It serves as a practical tool for psychologists to preemptively identify and minimize opportunistic choices during study planning, data collection, analysis, and reporting.
The logic of null hypothesis testing, p-values, Type I and II errors (and when and why they might happen). 12 / 12

Frequentist statistics are typically the default in quantitative research; they come with particular assumptions and implications and are often misinterpreted. This sub-cluster clarifies the logic of NHST, the meaning of p-values, and when and why Type I and Type II errors arise, extending to Type S and Type M errors. It links these to design and power choices and outlines practical steps for better inference and reporting.
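The two error types are easy to see operationally in a short simulation; the true effect (d = 0.5) and sample size below are illustrative assumptions.

```r
set.seed(7)
p_null <- replicate(5000, t.test(rnorm(30), rnorm(30))$p.value)       # H0 true
p_alt  <- replicate(5000, t.test(rnorm(30), rnorm(30, 0.5))$p.value)  # d = 0.5
mean(p_null < 0.05)   # Type I error rate: ~0.05 by construction
mean(p_alt >= 0.05)   # Type II error rate: ~0.53 here, so power is only ~0.47
```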

Banerjee, A., Chitnis, U., Jadhav, S., Bhawalkar, J., & Chaudhury, S. (2009). Hypothesis testing, type I and type II errors. Industrial Psychiatry Journal, 18(2), 127. https://doi.org/10.4103/0972-6748.62274
critique Paper
Understanding the Replication Crisis as a Base Rate Fallacy
This paper presents a theoretical critique of the standard narrative that the replication crisis is primarily caused by poor scientific conduct or questionable research practices. It uses the logic of the base rate fallacy to argue that high failure rates in replications are a predictable mathematical outcome in fields that investigate a large proportion of unlikely hypotheses.
Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49(12), 997–1003. https://doi.org/10.1037/0003-066X.49.12.997
critique Paper
Type I Error Rates are Not Usually Inflated
This article challenges the conventional wisdom that questionable research practices like p-hacking necessarily inflate Type I error rates. It introduces nuanced distinctions between different types of statistical errors to argue that many criticized practices do not impact the error rates relevant to the researchers' specific hypotheses.
practice/tools Paper
Beyond Power Calculations
This paper proposes a move beyond traditional power analysis by introducing "design calculations" to estimate Type S (sign) and Type M (magnitude) errors. These metrics help researchers understand the risk of obtaining results that are either in the wrong direction or grossly exaggerated in magnitude, particularly in small-sample studies.
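A simulation-flavored sketch of these design calculations (Gelman and Carlin also give analytic versions; the true effect and standard error below are assumed): among estimates that reach significance, how often is the sign wrong, and by how much is the magnitude exaggerated?

```r
retrodesign_sim <- function(true_effect, se, alpha = 0.05, nsim = 1e5) {
  est <- rnorm(nsim, mean = true_effect, sd = se)   # sampling dist. of estimates
  sig <- abs(est) > qnorm(1 - alpha / 2) * se       # which reach "significance"
  c(power  = mean(sig),
    type_s = mean(sign(est[sig]) != sign(true_effect)),  # wrong-sign rate
    type_m = mean(abs(est[sig])) / abs(true_effect))     # exaggeration ratio
}

set.seed(8)
retrodesign_sim(true_effect = 0.1, se = 0.2)   # small effect, noisy design
```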
policies Paper
Interrogating the “cargo cult science” metaphor
This paper critically examines Feynman's "cargo cult science" metaphor and its frequent invocation in debates about research integrity and the replication crisis. It questions what the metaphor illuminates, and what it obscures, about how questionable research practices arise and how scientific communities should respond to them.
critique Book Chapter
The Null Ritual: What You Always Wanted to Know About Significance Testing but Were Afraid to Ask
This publication critiques the institutionalized "null ritual," which it describes as an incoherent amalgamation of incompatible Fisherian and Neyman-Pearson statistical frameworks. It explains how this ritualized practice suppresses critical thinking and fosters the illusion that statistical significance is a substitute for scientific evidence and theoretical reasoning.
critique Paper
Mindless statistics
This article critiques the "null ritual" prevalent in the social sciences, where statistical procedures are applied mindlessly as a requirement for social group identification rather than scientific inquiry. It highlights how rigid adherence to significance levels leads to collective confusion among researchers and undermines the quality of statistical reasoning in scientific publications.
Lakens, D. Improving your statistical inferences. Online course. https://www.coursera.org/learn/statistical-inferences
evidence Paper
With Low Power Comes Low Credibility? Toward a Principled Critique of Results From Underpowered Tests
This paper develops a principled framework for critiquing results from underpowered tests, examining when low statistical power genuinely undermines the credibility of a reported finding. It cautions against blanket dismissals of underpowered studies and spells out the conditions under which power-based critiques are warranted.
critique Paper
Inconsistent multiple testing corrections: The fallacy of using family-based error rates to make inferences about individual hypotheses
This resource highlights a specific logical inconsistency where researchers apply familywise error rate corrections to individual hypothesis tests rather than joint union hypotheses. It argues that this practice leads to inappropriate inferential conclusions and clarifies the intended purpose of alpha-level adjustments.
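A brief illustration with made-up p-values: base R's p.adjust performs familywise corrections, and on the paper's reading the adjusted values license an inference about the family of tests jointly, not a sharper claim about any single hypothesis.

```r
p <- c(0.004, 0.020, 0.035, 0.300)   # made-up p-values from a family of 4 tests
p.adjust(p, method = "holm")         # Holm correction: controls the FWER
p.adjust(p, method = "bonferroni")   # Bonferroni: simpler, more conservative
```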
policies Paper
The ASA Statement on p-Values: Context, Process, and Purpose
This official statement from the American Statistical Association provides six principles to guide the use and interpretation of p-values in scientific research. It serves as a formal policy document intended to improve the transparency and reproducibility of statistical analysis across various disciplines.