6 Data and File Management

FORRT relies on open, accessible, and well-organized data and documentation to support collaboration across its distributed, volunteer-based teams. This chapter outlines best practices for managing files, documentation, and any data generated through FORRT projects.

6.1 File Storage and Collaboration

All files should be stored in centralized, team-specific folders using approved cloud-based services. Commonly used platforms include:

- Google Drive (for editable documents, slides, planning spreadsheets)
- OSF (for archival storage, public access, and preprints)
- GitHub (for websites, lesson plans, or code)

6.1.1 Principles

Use shared team folders, not personal accounts.
Share links rather than downloading and re-uploading copies, to avoid version conflicts.
Name files and folders clearly (e.g., “FORRT_ReplicationHub_WorkshopPlanning_2025-04”).
Keep public and private files clearly separated and labeled.

[INSERT LINK TO CURRENT SHARED FOLDER DIRECTORY or ACCESS REQUEST INFO]

[OPEN QUESTION: Should we designate a lightweight “documentation steward” role in each team?]

6.2 Naming and Version Control

To make files easier to find and track:

- Use clear, descriptive titles with project names and dates.
- Add version numbers or initials for clarity when appropriate.
- Use platform versioning (e.g., Google Drive revision history, GitHub commits) rather than creating duplicate copies.

Templates for file names and folders are available here:
[INSERT LINK TO NAMING GUIDELINES OR TEMPLATE]
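
As an illustration of the naming guidance, the following Python sketch builds and checks names shaped like the example given earlier ("FORRT_ReplicationHub_WorkshopPlanning_2025-04"). The pattern FORRT_<Team>_<Topic>_<YYYY-MM>, the optional _v<N> suffix, and the function names build_name and is_valid_name are assumptions made for this sketch, inferred from that single example; they are not an official FORRT convention and should be replaced by the naming template once it is linked.

```python
import re
from datetime import date
from typing import Optional

# Hypothetical pattern inferred from one example in this chapter:
# FORRT_<Team>_<Topic>_<YYYY-MM>, optionally followed by _v<N>.
NAME_PATTERN = re.compile(
    r"^FORRT_(?P<team>[A-Za-z0-9]+)_(?P<topic>[A-Za-z0-9]+)_"
    r"(?P<date>\d{4}-(0[1-9]|1[0-2]))(_v(?P<version>\d+))?$"
)


def _camel(text: str) -> str:
    """Join words without spaces or punctuation, capitalizing each word."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    return "".join(word[:1].upper() + word[1:] for word in words)


def build_name(team: str, topic: str, when: date,
               version: Optional[int] = None) -> str:
    """Assemble a file or folder name from its parts."""
    name = f"FORRT_{_camel(team)}_{_camel(topic)}_{when:%Y-%m}"
    if version is not None:
        name += f"_v{version}"
    return name


def is_valid_name(name: str) -> bool:
    """Check a name (without file extension) against the assumed pattern."""
    return NAME_PATTERN.match(name) is not None


if __name__ == "__main__":
    print(build_name("Replication Hub", "Workshop Planning", date(2025, 4, 1)))
    print(is_valid_name("workshop final FINAL"))
```

A check like this could be run occasionally over a folder listing to spot names that drift from the agreed format. Platform versioning remains the preferred way to track revisions, so the version suffix is optional here.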

6.3 Data Management for Projects

Most FORRT work is educational or resource-based and does not involve sensitive data. However, when data is collected (e.g., through surveys, interviews, analytics), teams must ensure ethical handling and documentation.

6.3.1 What to Consider

What data is being collected (e.g., feedback, demographic info)?
How will it be stored securely?
Who has access?
Will it be shared, published, or deleted?

A simple Data Management Plan (DMP) template is available and should be used for any project involving participant or user data. [INSERT LINK TO DMP TEMPLATE]

6.3.2 Roles and Responsibilities

Team/project leads are responsible for coordinating data handling practices.
If unsure, consult Team Ethics or the Steering Council for guidance.

[OPEN QUESTION: Should we create a central log or register of ongoing data-collecting projects across FORRT?]

6.4 Open Access and Licensing

All final outputs (documents, slides, lesson plans, tools) should be:

- Shared publicly via OSF, GitHub, or the FORRT website
- Accompanied by a clear usage license (typically CC BY-NC-SA 4.0)

Where possible, include:

- A README file explaining what the resource is, how it was created, and how to use it
- Metadata (title, creators, keywords, version, date, link to related projects)
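
One possible way to keep this metadata alongside a resource is a small machine-readable file. The Python sketch below is illustrative only: the class name ResourceMetadata, the field names, the JSON format, and the example values are assumptions, not a FORRT standard. The fields mirror the metadata items listed above, and the default license is the one this chapter names as typical.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ResourceMetadata:
    """Hypothetical record mirroring the metadata items listed in this section."""
    title: str
    creators: List[str]
    keywords: List[str]
    version: str
    date: str  # ISO 8601 date string, e.g. "2025-04-30"
    license: str = "CC BY-NC-SA 4.0"  # typical license for documents
    related_projects: List[str] = field(default_factory=list)


def to_json(meta: ResourceMetadata) -> str:
    """Serialize the record so it can be saved next to the README."""
    return json.dumps(asdict(meta), indent=2, ensure_ascii=False)


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    example = ResourceMetadata(
        title="Example workshop slides",
        creators=["Example Contributor"],
        keywords=["open science", "teaching"],
        version="1.0",
        date="2025-04-30",
    )
    print(to_json(example))
```

Saving the output as a file next to the README (for instance, a hypothetical metadata.json) keeps the description and the resource together when the folder is copied or archived.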

For code and software, use an OSI-approved open source license (e.g., MIT, GPL) and host on GitHub or similar.

[INSERT LINK TO LICENSING GUIDANCE OR METADATA CHECKLIST]

6.5 Backups and Archiving

Teams are encouraged to:

- Regularly review shared folders to clean up outdated drafts or duplicates
- Back up critical files in at least one additional location (e.g., OSF)

[OPEN QUESTION: Should we implement a quarterly “digital housekeeping” check-in with each team?]

6.6 Summary

Area	Practice / Tool	Responsible Party
File storage	Google Drive, OSF, GitHub	All contributors
Naming/version control	Clear titles + platform versioning	All contributors
Data collection projects	Fill out simple DMP	Project/team leads
Licensing and access	CC BY-NC-SA for docs, OSI license for code	Team leads
Archiving	Publish to OSF, tag final versions	Team leads or stewards

[INSERT LINKS: DMP template, license guidance, metadata checklist (could be derived from the below), shared folders]

6.7 iRISE Minimal Metadata Standards

[OPEN QUESTION: If we want to retain or adopt these standards, should they go into a separate file?]

All data outputs will be complemented with metadata, as outlined in the DMP. The following elements are required:

Title: title describing the data output at hand.
Principal Investigator or Creator: the main person(s) responsible for the intellectual content, with affiliation(s).
Contributor(s): any other person(s) who contributed to the data output, with affiliation(s).
Funding: funding source of the project leading to the data output (iRISE and additional funding sources must be acknowledged here).
References and citations: Citations to relevant work or other objects/material leading to the data output or using the data output. Only cite those articles or material that are important for the data output to be reusable and interpretable. Specifically, if applicable, cite any software or material needed to interact with the data.
Summary | Description: A textual description of the aims of data collection and a summary of the data output itself (in the form of a short abstract).
Keywords: List of relevant keywords making the metadata findable.
Coverage: when and where the data collection (or the project) started, and when it was finalized.
Date of publication: date of data deposition, for the first version and for any new versions.
Unit of observation
Population: information on the population of interest represented or targeted in the data output.
Data type and format: information on the type and format of the data collected.
Sampling and weighting: information on whether any sampling or weighting was used in the data acquisition, and if so, which type or method of sampling and/or weighting was used.
Mode of Collection: information on how the data was collected, i.e., the method used for data collection.
DOI
Licenses and restrictions
Ethical considerations: if ethical approval was needed and acquired, the metadata should link or cite the ethics approval.
Description of variables: if possible, this should be done in a separate code book or data dictionary.
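
The list above can be turned into a simple completeness check. The Python sketch below assumes the metadata is held as a dictionary; the key names (such as "mode_of_collection") and the function name missing_elements are hypothetical, chosen here to mirror the listed elements, and are not defined by iRISE. As a simplification, every element is treated as required, so conditional ones (ethical considerations, description of variables) would need an explicit entry such as "not applicable".

```python
from typing import Any, Dict, List

# Hypothetical keys, one per element listed in this section.
REQUIRED_ELEMENTS = [
    "title",
    "creator",
    "contributors",
    "funding",
    "references",
    "description",
    "keywords",
    "coverage",
    "date_of_publication",
    "unit_of_observation",
    "population",
    "data_type_and_format",
    "sampling_and_weighting",
    "mode_of_collection",
    "doi",
    "licenses_and_restrictions",
    "ethical_considerations",
    "description_of_variables",
]


def _is_empty(value: Any) -> bool:
    """Treat None, blank strings, and empty collections as missing."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, (list, tuple, dict)):
        return len(value) == 0
    return False


def missing_elements(record: Dict[str, Any]) -> List[str]:
    """Return the required keys that are absent or empty in the record."""
    return [key for key in REQUIRED_ELEMENTS if _is_empty(record.get(key))]


if __name__ == "__main__":
    # Placeholder record for illustration only.
    draft = {"title": "Example survey data", "funding": "iRISE"}
    for key in missing_elements(draft):
        print("missing:", key)
```

Running the check before depositing a data output gives a quick list of elements still to be filled in; it says nothing about whether the entries themselves are accurate.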