Documenting data

Information on how to provide good documentation for your data.

Adopting good documentation practices at the start of a project will help ensure the integrity of your datasets. At the end of the project, such documentation will help make your research data discoverable, understandable, and reusable by you and others. Ideally, the documentation should include contextual information about the research and data creation processes, as well as data-level descriptions and annotations. Where appropriate, the documentation files should be kept with the data files to aid interpretation and understanding. Where data have been anonymised, the coding key should be kept separately from the data to maintain participant confidentiality.

Data-level (embedded) documentation

Data-level documentation is information about a file or dataset that is contained within the data or document itself. It covers descriptions and annotations that are embedded in a data file, such as:

  • Field and label descriptions.
  • Explanation of codes or classification schemes.
  • Descriptive headers or summaries.

The UK Data Archive provides detailed examples of data-level documentation for structured tabular data and qualitative data.
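As a rough illustration of embedded documentation, the sketch below writes a small tabular file with a descriptive header, field labels, and an explanation of codes placed above the data rows. The variable names, code values, and the use of "#" comment lines are assumptions made for this example, not a prescribed format; check what your software and chosen repository expect before adopting a similar layout.

```python
import csv
import io

# Hypothetical codebook: the variable names, labels, and codes are illustrative only.
CODEBOOK = {
    "resp_id": {"label": "Anonymised respondent identifier", "codes": None},
    "age_grp": {"label": "Age group",
                "codes": {1: "18-34", 2: "35-64", 3: "65 and over"}},
    "employed": {"label": "Currently in paid employment",
                 "codes": {0: "No", 1: "Yes", -9: "Not answered"}},
}


def header_lines(title, codebook):
    """Build descriptive header lines, each prefixed with '#'."""
    lines = [f"# {title}"]
    for name, info in codebook.items():
        line = f"# {name}: {info['label']}"
        if info["codes"]:
            codes = "; ".join(f"{k} = {v}" for k, v in info["codes"].items())
            line += f" ({codes})"
        lines.append(line)
    return lines


def write_documented_csv(rows, title, codebook):
    """Return CSV text with the descriptive header embedded above the data."""
    buffer = io.StringIO()
    for line in header_lines(title, codebook):
        buffer.write(line + "\n")
    writer = csv.DictWriter(buffer, fieldnames=list(codebook), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up rows, used only to show the resulting layout.
rows = [
    {"resp_id": "R001", "age_grp": 2, "employed": 1},
    {"resp_id": "R002", "age_grp": 3, "employed": -9},
]
documented_csv = write_documented_csv(
    rows, "Example survey extract (illustrative data)", CODEBOOK
)
print(documented_csv)
```

Keeping the labels and codes in one structure means the same codebook can also be exported as a separate file if a tool cannot read comment lines in the data file.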

Study-level (supporting) documentation

Study-level documentation consists of separate files that accompany the data and give an overview of the research context and design, data collection methods, data preparation, and results or findings. These files may include information on the context of data collection and the methods used, the structure of files, the data sources used, or validation methods. This information is usually not embedded in a data file. Some examples of study-level documentation are:

  • Working papers or laboratory books.
  • Questionnaire designs or interview guides.
  • Final project reports and publications.

In some cases, the supporting documentation may require digitising before it can be included alongside the digital research datasets. Examples of digitisation processes include scanning a handwritten laboratory notebook or transcribing an audio recording.

Catalogue metadata

Metadata for online catalogues is often structured according to international standards or schemes. Repositories or data centres use this metadata to facilitate identification and discovery of the data. This structured information captures details about the purpose, origin, creator, access conditions, and terms of use of a research dataset. Examples of metadata fields include:

  • Title.
  • Description.
  • Abstract.
  • Creator.
  • Geographic location.
  • Keywords.

The Digital Curation Centre maintains a comprehensive list of disciplinary metadata standards.
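To make the idea of a structured record concrete, here is a minimal sketch of a catalogue entry built from the example fields listed above, with a simple completeness check before it is serialised. The key names, the placeholder values, and the choice of JSON are assumptions for illustration only; a real repository will specify its own schema and required fields.

```python
import json

# Field names follow the example list in this section; they are not taken
# from any particular metadata standard.
REQUIRED_FIELDS = [
    "title",
    "description",
    "abstract",
    "creator",
    "geographic_location",
    "keywords",
]

# Placeholder record: every value here is invented for the example.
record = {
    "title": "Example household survey dataset",
    "description": "Tabular survey responses with an accompanying codebook.",
    "abstract": "Illustrative abstract summarising the purpose of the study.",
    "creator": "Example Researcher",
    "geographic_location": "Example region",
    "keywords": ["survey", "households", "example"],
}


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in required if not record.get(field)]


# Check completeness first, then serialise the record for deposit or exchange.
problems = missing_fields(record)
if problems:
    print("Missing metadata fields:", ", ".join(problems))
else:
    metadata_json = json.dumps(record, indent=2, ensure_ascii=False)
    print(metadata_json)
```

Checking for empty fields before deposit is a small step, but incomplete records are harder for others to find through a catalogue search.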

