When sharing a dataset or code, use the DOI assigned by the data repository and include it in the data availability statement at the end of the paper (much like the acknowledgements section). Also cite the dataset in the reference list itself, because only entries in the reference list contribute to citation counts. Data citation matters because it facilitates access and transparency, supports reproducibility and reuse, and gives credit to researchers. It also provides recognition and visibility for the repositories that share the data.

You can find examples of these statements in publishers’ research data policies for authors.

Data and code availability statement examples

Using the Digital Object Identifier (DOI): “The data that support the findings of this study are openly available in [repository name] at https://doi.org/[doi].”

If no DOI is issued: “The data that support the findings of this study are openly available in [repository name] at [URL], reference number [reference number].”

When there is an embargo period, you can reserve your DOI and still include a reference to the dataset in your paper: “The data that support the findings will be available in [repository name] at [URL / DOI] following a [6-month] embargo from the date of publication to allow for the commercialisation of research findings.”

When data cannot be made available: “Restrictions apply to the data that support the findings of this study. [Explain the nature of the restrictions, for example, if the data contain information that could compromise the privacy of research participants.] Data are available upon reasonable request by contacting [name and contact details] and with permission of [third party name].”

When code is shared: “Data and code to reproduce the results shown in the paper can be obtained from The Turing Way (2023) at Zenodo (link) and GitHub (link). We used R version 4.2.2 and the following R packages: ggplot2 (Wickham 2016).”
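If you prepare several manuscripts, the statement templates above can be filled in programmatically. The following is a minimal sketch, not an official tool: the function name, the dictionary keys, and the example DOI are hypothetical, and only two of the templates are included. It assumes the https form of the DOI resolver.

```python
# Minimal sketch: fill in a data availability statement from a template.
# The template texts are adapted from the examples above; the keys "doi" and
# "no_doi" and the function name are hypothetical, chosen for illustration.
TEMPLATES = {
    "doi": (
        "The data that support the findings of this study are openly "
        "available in {repository} at https://doi.org/{doi}."
    ),
    "no_doi": (
        "The data that support the findings of this study are openly "
        "available in {repository} at {url}, reference number {reference}."
    ),
}


def availability_statement(kind: str, **fields: str) -> str:
    """Return the filled template; fail loudly if anything is missing."""
    try:
        return TEMPLATES[kind].format(**fields)
    except KeyError as missing:
        # Raised for an unknown template key or an unfilled placeholder.
        raise ValueError(f"unknown template or missing field: {missing}") from None


if __name__ == "__main__":
    # "10.1234/example" is a made-up placeholder, not a real DOI.
    print(availability_statement("doi", repository="Zenodo", doi="10.1234/example"))
```

Raising an error on a missing field is a deliberate choice here: it prevents a statement with an unfilled placeholder from ending up in a submitted manuscript.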

How to cite a dataset in a book or conference paper?

Citations of original data are usually included in the “References” or “Bibliography” section. It is also important to mention them in the text where the data are presented, clearly indicating that the works or data are your own.
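As an illustration of what a dataset entry in the reference list might look like, the sketch below assembles one from basic metadata. It loosely follows the “Creator (Year). Title. Version. Publisher. Identifier” pattern recommended by DataCite; this is an assumption about a suitable format, so check the citation style required by your publisher. The class, the function, and all example values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Dataset:
    """Metadata needed for a reference-list entry (hypothetical structure)."""
    creators: List[str]
    year: int
    title: str
    publisher: str  # usually the repository that hosts the dataset
    doi: str
    version: Optional[str] = None


def format_reference(ds: Dataset) -> str:
    """Build 'Creator (Year). Title. Version. Publisher. DOI link'."""
    parts = [f"{'; '.join(ds.creators)} ({ds.year}).", f"{ds.title}."]
    if ds.version:
        parts.append(f"Version {ds.version}.")
    parts.append(f"{ds.publisher}.")
    parts.append(f"https://doi.org/{ds.doi}")
    return " ".join(parts)


if __name__ == "__main__":
    # All values below are made-up placeholders.
    example = Dataset(
        creators=["Doe, J.", "Roe, R."],
        year=2024,
        title="Example survey data",
        publisher="Zenodo",
        doi="10.1234/example",
        version="1.0",
    )
    print(format_reference(example))
```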

How to cite your datasets and DMP in your doctoral thesis?

You can create a section titled “Open Research” containing the following statements:

DATA AVAILABILITY

The data supporting the findings of this study are available in [Repository Name], [Repository URL], with the unique identifier [DOI or dataset identifier]. 

DMP AVAILABILITY

The Data Management Plan (DMP) associated with this thesis is available at [Repository], [DMP URL]. This document describes the procedures for data management during and after the research project, including data collection, documentation, storage, and preservation.

Questions?

For more information and assistance, contact the Documentation Centre and Library.


Source:

Citing Research Objects (The Turing Way). https://the-turing-way.netlify.app/communication/citable/citable-cite#cm-citable-cite-data


Last updated: 11/06/2025