Secondary Data Preregistration

By Alexander C. DeHaven, Andrew Hall, Brian Brown, Charles R. Ebersole, Courtney K. Soderberg, David Thomas Mellor, Elliott Kruse, Jerome Olsen, Jessica Kosie, K. D. Valentine, Lorne Campbell, Marjan Bakker, Olmo van den Akker, Pamela Davis-Kean, Rodica I. Damian, Sara J. Weston, Stuart J. Ritchie, Thuy-vy Nguyen, William J. Chopik.

Abstract

Preregistration is the process of specifying project details, such as hypotheses, data collection procedures, and analytical decisions, before conducting a study. It is designed to draw a clearer distinction between data-driven, exploratory work and a priori, confirmatory work. Both modes of research are valuable, but they are easy to conflate unintentionally. See the Preregistration Revolution for more background and recommendations.

Research that uses existing datasets carries an increased risk that analysts will be biased by preliminary trends in the data. That risk can be mitigated by blinding analysts to any summary statistics in the dataset and by using hold-out datasets (where the “training” and “validation” datasets are kept separate from each other). See this page for specific recommendations about “split samples” or “hold-out” datasets. Finally, if those procedures are not followed, disclosing possible biases can inform the researcher and her audience about the proper role any results should have (i.e., the results should be deemed mostly exploratory and best suited to additional confirmation).
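As a rough illustration of the hold-out idea, the Python sketch below randomly partitions an existing dataset into an exploratory ("training") portion and a confirmatory ("validation") portion that is set aside until the preregistration is filed. The function name `split_holdout`, the 30% hold-out fraction, the seed value, and the toy records are all assumptions made for this example and are not part of the template.

```python
import random


def split_holdout(records, holdout_fraction=0.5, seed=42):
    """Randomly partition records into (exploratory, confirmatory) lists.

    Hypothetical helper for illustration only; the fraction and seed
    are arbitrary choices, not recommendations from the template.
    """
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be strictly between 0 and 1")

    # A fixed seed makes the split reproducible, so it can be reported
    # in the preregistration and re-created by others.
    rng = random.Random(seed)
    indices = list(range(len(records)))
    rng.shuffle(indices)

    n_holdout = round(len(records) * holdout_fraction)
    holdout_indices = set(indices[:n_holdout])

    # Preserve the original row order within each partition.
    exploratory = [r for i, r in enumerate(records) if i not in holdout_indices]
    confirmatory = [r for i, r in enumerate(records) if i in holdout_indices]
    return exploratory, confirmatory


# Toy data standing in for an existing dataset.
records = [{"id": i} for i in range(10)]
exploratory, confirmatory = split_holdout(records, holdout_fraction=0.3, seed=42)
```

Under this sketch, hypotheses would be developed on `exploratory` only, and `confirmatory` would stay unopened (ideally stored separately) until the analysis plan has been registered.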

This project contains a template for creating your preregistration, designed specifically for research using existing data. In the future, this template will be integrated into the OSF.

Link to resource: https://osf.io/x4gzt/wiki/home/

Type of resources: Reading

Education level(s): Graduate / Professional

Primary user(s):

Subject area(s): Applied Science

Language(s): English