Research Data Management

What kind of data?

Please note that this guide is being updated during spring 2023.

Briefly describe your research data. Explain what kinds of data you are collecting or producing. Outline how the data will be collected: e.g. via surveys, interviews, laboratory experiments or observations. Moreover, explain what kinds of existing data you will reuse.

Briefly describe what types of data will be used and are expected to be produced: e.g. texts, images, photos, video or audio recordings, statistics, measurements, physical samples.

In data management planning, you should distinguish between three categories of data in your research:

  1. Data collected or created by yourself.
  2. Data collected or created in some other research project.
  3. Documents and other material from archives, libraries and museums.

These three categories differ in who is responsible for documenting and storing the data for potential reuse. You are responsible only for data that you have collected or created yourself (category 1). If the data was collected or created by another research project (category 2), or if you use material from an archive, library or museum (category 3), that responsibility lies with the researcher who originally collected or created the data, or with the archive, library or museum. In every case, however, you have to plan how to document and store any data you are going to use during your research.

File formats

Ensure long-term readability and access

Pay attention to the format in which data is stored, so that the data remains usable after extended periods of time. Recommended file formats are those that are used extensively in the scientific community and supported by a variety of software.

  • Choose file formats early in the research cycle. There is no single list of file formats that is appropriate for all use cases.
  • Make sure that the format suits its purpose. Consider e.g. data creation, data analysis, the software programs used, relationships between files, suitability for conversion, whether the format is lossy, whether it is open, sharing of data, long-term sustainability, discipline-specific standards, interoperability between software programs, data interchange and transformation, and backups.
  • Avoid proprietary formats that are owned by a company. Among proprietary formats, the widely used and popular ones are the most likely, though not guaranteed, to remain sustainable in the long term (e.g. MS Word, Rich Text Format, MS Excel, SPSS format).
  • To support long-term data preservation, convert data to standard or open formats (e.g. PDF/A, CSV, TIFF, ODF, ASCII, XML). Conversion may cause some data loss, so keep a copy in the original software format.
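
The last recommendation can be illustrated with a minimal Python sketch. It is not a prescribed workflow. The function names `archive_with_open_copy` and `read_csv_rows`, the example file name and the folder layout are hypothetical. The sketch also assumes that the records have already been read from the original file with whatever software produced it.

```python
import csv
import shutil
from pathlib import Path


def archive_with_open_copy(original: Path, rows: list[dict], archive_dir: Path) -> tuple[Path, Path]:
    """Keep the original file and store an open-format (CSV) copy next to it.

    `rows` are the records already extracted from `original`; how they are
    read depends on the source software and is outside this sketch.
    """
    if not rows:
        raise ValueError("no records to export")
    archive_dir.mkdir(parents=True, exist_ok=True)

    # 1. Preserve the file in its original software format, byte for byte.
    kept_original = archive_dir / original.name
    shutil.copy2(original, kept_original)

    # 2. Write the same records as UTF-8 CSV for long-term readability.
    csv_copy = archive_dir / (original.stem + ".csv")
    with csv_copy.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)

    return kept_original, csv_copy


def read_csv_rows(path: Path) -> list[dict]:
    """Read the CSV copy back, e.g. to compare it with the source records."""
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Reading the CSV copy back shows one kind of loss that conversion can cause: CSV stores every value as text, so a number such as `1` is returned as the string `"1"`. This is one reason to keep the original file alongside the converted copy.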

Data quality

Quality control of data is an integral part of all research and takes place at various stages: during data collection, data entry or digitisation, and data checking. It is vital to assign clear roles and responsibilities and to develop suitable procedures before data gathering starts. During data collection, researchers must ensure that the data recorded reflect the actual facts, responses, observations and events. The quality of the data collection methods used strongly influences data quality, and documenting in detail how data are collected provides evidence of such quality.

  • Data collection, analysis and processing methods may affect the quality of data
  • Ensure that no data is accidentally changed
  • Ensure that the accuracy of data is maintained over its entire life cycle
  • Quality problems can emerge
    • Due to technical handling
    • Due to converting or transferring of data
    • During the contextual processing and analysis
  • Ensure that the original information content is preserved during conversions and transfers
  • Never destroy original data!
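
One common way to check that no file has been changed accidentally, for example after a transfer or a backup, is to compare checksums. The following Python sketch is only an illustration under stated assumptions. The function names `sha256_of`, `build_manifest` and `changed_files` are hypothetical, and SHA-256 is one reasonable choice of checksum, not a requirement of this guide.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file under data_dir (by relative path) to its checksum."""
    return {
        path.relative_to(data_dir).as_posix(): sha256_of(path)
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    }


def changed_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List files that were modified, added or removed since the manifest was built."""
    current = build_manifest(data_dir)
    names = manifest.keys() | current.keys()
    return sorted(name for name in names if manifest.get(name) != current.get(name))
```

A manifest built right after data collection can be stored with the documentation. Running `changed_files` again after each conversion, transfer or restore then reports any file whose content no longer matches the recorded checksum.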