Documentation
Source: Documentation. The Turing Way Community. This illustration is created by Scriberia with The Turing Way community, used under a CC-BY 4.0 licence. DOI: 10.5281/zenodo.3332807.

Documenting your data means providing information that allows other users to understand and use your data. It is a requirement for open data, as shown in the FAIR principles. Data documentation can take different forms, from a simple text document (often called a README file) to information embedded within the files themselves, or even structured descriptive lists such as a catalogue.

What is a README File?

A README file is usually a plain-text file named README.txt, located at the root of your dataset. Its name signals that any potential user of your data should read it before looking at any other part of the dataset.

The main README file explains the contents and structure of your dataset, and gives a potential user enough information to decide whether the data is of interest to them. If your dataset requires a codebook, the codebook can be included in the README file. You can also create secondary README files in subfolders to document specific parts of your data.
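
As an illustration of documenting structure, the following minimal Python sketch builds an indented listing of a dataset folder that could be pasted into a README file. The function name describe_structure and the two-space indentation are assumptions made for this example, not part of any template referenced in this guide.

```python
from pathlib import Path


def describe_structure(root: Path, indent: str = "  ") -> str:
    """Return an indented text listing of every file and folder under root."""
    lines = [f"{root.name}/"]
    # Sort by relative path parts so each folder is followed by its contents.
    entries = sorted(root.rglob("*"), key=lambda p: p.relative_to(root).parts)
    for path in entries:
        depth = len(path.relative_to(root).parts)
        suffix = "/" if path.is_dir() else ""
        lines.append(f"{indent * depth}{path.name}{suffix}")
    return "\n".join(lines)
```

Calling print(describe_structure(Path("my_dataset"))), where my_dataset is a hypothetical folder name, prints the listing. A short description of each file or folder still has to be written by hand.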

 

Guidance on the README file for archaeological datasets

Take a look at this guide to the README file for datasets, which provides best practices and a template.

What’s in a Data Dictionary?

A data dictionary is used to catalog and communicate the structure and content of data and provides meaningful descriptions for individually named data objects.

Data dictionary contents can vary but typically include some or all of the following:

  • A listing of data objects (names and definitions).
  • Detailed properties of data elements (data type, size, nullability, optionality, indexes).
  • Entity-relationship diagrams.
  • Reference data (classification and descriptive domains).
  • Missing data and quality-indicator codes.
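
To make the list above concrete, here is a minimal Python sketch of a data dictionary for an imaginary table of excavation finds, with a small check of one record against it. Everything in it is an assumption made for illustration: the Element class, its attribute names, the three example elements and the -999 missing-data code are hypothetical and do not come from the Data Dictionary Template.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """One data dictionary entry (attribute names are illustrative only)."""
    name: str
    definition: str
    data_type: type
    nullable: bool = False
    allowed_values: frozenset = frozenset()  # reference data; empty = unrestricted
    missing_code: object = None              # code that marks missing data


# Hypothetical elements for an imaginary excavation-finds table.
DICTIONARY = [
    Element("find_id", "Unique identifier of the find", int),
    Element("material", "Main material of the find", str,
            allowed_values=frozenset({"ceramic", "bone", "metal"})),
    Element("weight_g", "Weight in grams; -999 means not measured", float,
            nullable=True, missing_code=-999.0),
]


def check_record(record: dict, dictionary: list[Element]) -> list[str]:
    """Return a list of human-readable problems found in one record."""
    problems = []
    for element in dictionary:
        value = record.get(element.name)
        if value is None:
            if not element.nullable:
                problems.append(f"{element.name}: value is required")
            continue
        if value == element.missing_code:
            continue  # explicitly coded as missing, nothing more to check
        if not isinstance(value, element.data_type):
            problems.append(f"{element.name}: expected {element.data_type.__name__}")
        elif element.allowed_values and value not in element.allowed_values:
            problems.append(f"{element.name}: {value!r} is not an allowed value")
    return problems
```

For example, check_record({"find_id": 2, "material": "glass", "weight_g": -999.0}, DICTIONARY) reports that 'glass' is not an allowed value, while the weight is accepted because it carries the missing-data code.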

How Are Data Dictionaries Used?

Documentation: provide data structure details for users, developers, and other stakeholders.

Communication: equip users with a common vocabulary and definitions for shared data, data standards, data flow and exchange, and help developers gauge the impact of schema changes.

Application Design: help application developers create forms and reports with proper data types and controls, and ensure that navigation is consistent with data relationships.

Systems Analysis: enable analysts to understand overall system design and data flow, and to find where data interact with various processes or components.

Data Integration: provide clear definitions of data elements, which give the contextual understanding needed when deciding how to map one data system to another, or whether to subset, merge, stack, or transform data for a specific use.
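
As a rough illustration of the communication and data integration uses, the Python sketch below compares two simplified data dictionaries, each reduced to a mapping from element name to data type, and reports where they differ. The function name compare_dictionaries and the shape of its result are assumptions for this example; a real mapping exercise would also compare definitions, units and reference data.

```python
def compare_dictionaries(source: dict[str, str],
                         target: dict[str, str]) -> dict[str, list[str]]:
    """Report elements that are missing on either side or differ in type."""
    shared = source.keys() & target.keys()
    return {
        "only_in_source": sorted(source.keys() - target.keys()),
        "only_in_target": sorted(target.keys() - source.keys()),
        "type_conflicts": sorted(n for n in shared if source[n] != target[n]),
    }


# Hypothetical dictionaries from two systems that are to be merged.
old_system = {"find_id": "integer", "material": "text", "weight_g": "decimal"}
new_system = {"find_id": "text", "material": "text", "context_id": "integer"}

report = compare_dictionaries(old_system, new_system)
```

Here report lists weight_g as present only in the source, context_id as present only in the target, and find_id as a type conflict, which are the elements that need a decision before the data are merged.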

Useful tools

  • README file template for archaeological datasets
  • README file template for master's degree Final Project
  • Data Dictionary Template

Questions?

For more information and assistance, contact the Documentation Centre and Library.


Sources:

Describe (Metadata/Documentation) (U.S. Geological Survey): https://www.usgs.gov/data-management/describe-metadatadocumentation

README File (Geneva Graduate Institute): https://libguides.graduateinstitute.ch/rdm/readme


Last updated: 21/02/2024