Research Data Management

Documentation

Document your data at the very beginning of your research project!

  • make a note of all file names and formats associated with the project, how the data is organized, how the data was generated (including any equipment or software used), and information about how the data has been altered or processed.
  • include an explanation of codes, abbreviations, or variables used in the data or in the file naming structure.
  • keep notes about where you got the data so that you and others can find it.
  • keep notes about every choice and decision you have made: the reasons for leaving something out, the reasons for choosing certain classifications, the sampling principles, how you recruited your data subjects, etc.

Careful documentation of how data is collected or created, and how it is processed, is important to the quality of the data. It is also crucial to the long-term storage of your data. Good documentation, that is, good metadata, is necessary for other researchers to find, understand, use and properly cite your data.

The following are general guidelines on the aspects of your project and data that you should document, regardless of your discipline. At a minimum, store this documentation in a readme.txt file or the equivalent, together with the data.

  • TITLE: Name of the dataset or research project that produced it
  • CREATOR: Names and addresses of the organization or people who created the data
  • IDENTIFIER: Number used to identify the data, even if it is just an internal project reference number
  • DATES: Key dates associated with the data, including project start and end dates, data modification date, data release date, and time period covered by the data
  • SUBJECT: Keywords or phrases describing the subject or content of the data
  • FUNDERS: Organizations or agencies who funded the research
  • RIGHTS: Any known intellectual property rights held for the data
  • LANGUAGE: Language(s) of the intellectual content of the resource, when applicable
  • LOCATION: If the data relates to a physical location, record information about its spatial coverage
  • METHODOLOGY: How the data was generated, including equipment or software used, experimental protocol, and other details you might record in a field notebook
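The field list above can be turned into a small script that writes the readme.txt file next to the data. The following is a minimal sketch in Python, not a prescribed tool: the function names `render_readme` and `write_readme` are made up for illustration, and treating LANGUAGE and LOCATION as optional is one reading of "when applicable" in the list above.

```python
from pathlib import Path

# Field names taken from the list above, in the same order.
README_FIELDS = [
    "TITLE", "CREATOR", "IDENTIFIER", "DATES", "SUBJECT",
    "FUNDERS", "RIGHTS", "LANGUAGE", "LOCATION", "METHODOLOGY",
]

# Assumption: these two fields only apply to some datasets.
OPTIONAL_FIELDS = {"LANGUAGE", "LOCATION"}


def render_readme(metadata: dict) -> str:
    """Return the readme text, one 'FIELD: value' line per field."""
    missing = [
        field for field in README_FIELDS
        if field not in OPTIONAL_FIELDS and not metadata.get(field)
    ]
    if missing:
        raise ValueError("Missing documentation fields: " + ", ".join(missing))
    lines = [
        f"{field}: {metadata.get(field) or 'not applicable'}"
        for field in README_FIELDS
    ]
    return "\n".join(lines) + "\n"


def write_readme(data_folder: Path, metadata: dict) -> Path:
    """Write readme.txt into the folder that holds the data."""
    readme_path = Path(data_folder) / "readme.txt"
    readme_path.write_text(render_readme(metadata), encoding="utf-8")
    return readme_path
```

Refusing to write the file when a required field is empty is a deliberate choice in this sketch: it makes gaps in the documentation visible at the start of the project rather than at deposit time.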

File naming

Plan how you will name your files at the beginning of the project. The plan has to be precise enough to cover all needs that arise during the whole project. The goals of the plan are

  • to be able to see what kind of information the files contain (human-readable filenames)
  • to be able to sort and find your files by computer (machine-readable filenames)
  • to be able to keep all data in a logical order
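One way to meet these three goals is to build every filename from the same parts in the same order. The sketch below is only an illustration in Python: the pattern project_content_date_version, the function name `make_filename`, and the example values are all assumptions, not a convention this guide prescribes.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Lowercase the text and replace anything but letters and digits with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def make_filename(project: str, content: str, day: date, version: int, ext: str) -> str:
    """Build a name such as 'soil-study_field-notes_2024-03-05_v02.csv'.

    Underscores separate the parts, so a program can split the name again;
    the ISO date (YYYY-MM-DD) and zero-padded version sort correctly as text.
    """
    extension = ext.lower().lstrip(".")
    return f"{slug(project)}_{slug(content)}_{day.isoformat()}_v{version:02d}.{extension}"


# Hypothetical example values.
name = make_filename("Soil Study", "Field notes", date(2024, 3, 5), 2, ".CSV")
print(name)
```

Because the date is written year first, an ordinary alphabetical sort of files from the same project and content type also puts them in chronological order.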

Data quality

Quality control of data is an integral part of all research and takes place at various stages: during data collection, data entry or digitisation, and data checking. It is vital to assign clear roles and responsibilities and to develop suitable procedures before data gathering starts. During data collection, researchers must ensure that the data recorded reflect the actual facts, responses, observations and events. The quality of the data collection methods strongly influences data quality, and documenting in detail how data are collected provides evidence of that quality.

  • Data collection, analysis and processing methods may affect the quality of data
  • Ensure that no data is accidentally changed
  • Ensure that the accuracy of data is maintained over its entire life cycle
  • Quality problems can emerge
    • Due to technical handling
    • Due to converting or transferring of data
    • During the contextual processing and analysis
  • Ensure that the original information content is preserved during conversions and transfers
  • Never destroy original data!
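
A common way to check that a file was not changed during a transfer is to compare checksums before and after the copy. The following Python sketch shows the idea with SHA-256 from the standard library; the function names `sha256_of` and `copy_verified` are made up for illustration, and the approach only detects changes to the bytes of a file, not problems introduced by format conversion.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verified(source: Path, target: Path) -> str:
    """Copy source to target, leave the original in place, and verify the copy.

    Returns the checksum so it can be recorded in the project documentation.
    """
    checksum_before = sha256_of(source)
    shutil.copy2(source, target)  # copy2 also keeps timestamps where possible
    checksum_after = sha256_of(target)
    if checksum_before != checksum_after:
        raise OSError(f"Checksum mismatch after copying {source} to {target}")
    return checksum_after
```

Storing the returned checksum alongside the data makes it possible to repeat the check later and confirm that the file is still identical to the original.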