Preserving research data
Preserving research data involves more than saving it on a hard drive or server. In fact, digital data is harder to preserve than paper: it degrades faster and, unlike paper, you cannot simply pick it up and read it.
What is data preservation?
Data preservation involves converting your data to a preservation format and storing it within a special preservation repository where it will be actively stewarded—managed and migrated as formats change.
When you deposit a tabular data file (e.g., CSV, Excel, SPSS, RData) in the CORA.RDR, the repository will automatically convert your data into a format that increases the likelihood that it can be preserved.
Why should I keep data?
You are legally required to keep research data for a period of time. It is also a good idea because you may want to use the data again in the future, promote it for use by other researchers, or need it to verify your results. Some publishers ask for access to data.
How long you retain data depends on several factors, including the contractual requirements of funding bodies, legal requirements to retain or discard data, and best practices in your field.
Which data should be preserved?
It is rarely feasible to preserve all data generated during a research project. Focus instead on data that is unique or difficult to replicate. If you receive federal funding, you also need to preserve the data needed to validate your research findings.
Some criteria you can use when selecting data for long-term preservation:
- Type of data (raw, processed) and ease of generation
- Relevance of content to others
- Ease of reuse of the format by others
- Data linked to a publication
- Investigation verification
- Time available
- Available financial resources
Note
You do not need to keep everything or all versions of your data. You only need to retain the data that verifies your research findings.
What can I do to ensure my data can be preserved?
The most important step you can take is to deposit and share your data in a data repository. Most repositories will convert your data to sustainable formats.
During the planning stage of your project, you can also identify which of your data are stored in proprietary file formats and migrate the data to more sustainable file formats before depositing it into a repository. Find more information on recommended preservation formats at the ICAC Sustainability of Digital Formats.
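The planning-stage check described above can be sketched in a few lines of Python. This is a minimal illustration, not a repository tool: the extension list and the suggested targets are assumptions made for the example, and the function name flag_proprietary is hypothetical. Consult your repository's guidance for the formats it actually recommends.

```python
from pathlib import PurePath

# Assumed mapping for illustration only: proprietary tabular formats
# and a plain-text format they could be migrated to before deposit.
SUGGESTED_FORMATS = {
    ".xlsx": ".csv",  # Excel workbook
    ".xls": ".csv",   # legacy Excel workbook
    ".sav": ".csv",   # SPSS data file
}


def flag_proprietary(filenames):
    """Return (filename, suggested_extension) pairs for flagged files."""
    flagged = []
    for name in filenames:
        # Compare extensions case-insensitively (".XLSX" and ".xlsx" match).
        suffix = PurePath(name).suffix.lower()
        if suffix in SUGGESTED_FORMATS:
            flagged.append((name, SUGGESTED_FORMATS[suffix]))
    return flagged


if __name__ == "__main__":
    for name, target in flag_proprietary(["survey.XLSX", "notes.txt"]):
        print(f"{name}: consider migrating to {target}")
```

The sketch only reports candidates; the migration itself depends on the software that produced each file.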
Also, try to preserve research data in a repository which provides data curation services, not just preservation services. Curated data is more valuable, easier to reuse, easier to locate, and more highly cited. Curation activities include verifying the integrity and quality of data, migrating data formats, and creating descriptive records for data.
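One common way to verify integrity is to record a checksum when a file is deposited and recompute it later. The following is a minimal sketch using SHA-256 from Python's standard library; it is not a description of how any particular repository performs its checks, and the function names are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large data files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_checksum):
    """Return True if the file still matches the checksum recorded earlier."""
    return sha256_of_file(path) == recorded_checksum
```

Store the recorded checksum separately from the file itself, for example in the descriptive record, so that both are unlikely to be damaged together.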
For how long will the data be preserved?
The ICAC Protocol for the management of research data states that research data must be kept for (at least) 10 years after publication or public release.
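As a worked example of the 10-year minimum, the earliest date on which data could be considered for disposal can be computed from the publication date. This is a sketch under assumptions: the function name is hypothetical, and rolling a 29 February publication date forward to 1 March is a choice made here so that the full period is always covered. Funder or legal requirements may demand longer retention.

```python
from datetime import date

RETENTION_YEARS = 10  # minimum period stated in the protocol


def earliest_disposal_date(published, years=RETENTION_YEARS):
    """Return the first date on which the minimum retention period has passed."""
    try:
        return published.replace(year=published.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year;
        # roll forward to 1 March so the full period is still covered.
        return date(published.year + years, 3, 1)


if __name__ == "__main__":
    print(earliest_disposal_date(date(2023, 7, 18)))
```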
Data preservation involves
Data preservation involves data curation activities such as data integrity checks, format migrations, and the creation of descriptive records.
To reiterate
Saving data on hard drives/servers/etc. is not data preservation - it's data storage.
Last updated: 18/07/2023