Research Data Cycle


Research data management, i.e. the handling of research data from its creation through storage to reuse, can be illustrated as a research data cycle.



In practice, the individual phases of research data management are not as strictly sequential as the research data cycle suggests. An exemplary representation of a project as a Gantt diagram shows how the phases partially overlap.
[Figure: Gantt diagram of the research data management phases]



Research data management begins with systematic planning: which data are needed, and how will they be collected, processed, and stored? Project funders generally require structured data management as early as the application stage. At the same time, funds for research data management can be requested.
For this purpose, data management plans (DMPs) have become established in research data management. Tools such as RDMO or the DMPTool support the creation of dynamic DMPs for funders and project management.

Researchers are responsible for generating and collecting data, e.g. with sensor networks or simulations on high-performance computers, and for evaluating it, e.g. with algorithms and software. Describing research data with the necessary metadata, including scientific, administrative, and IT metadata, is essential. An electronic laboratory notebook (ELN) can support researchers in this process.
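
The three kinds of metadata mentioned above can be made concrete with a small record structure. The following Python sketch is only an illustration: the class name, the field names, and all example values are hypothetical and do not follow any particular metadata standard or ELN format.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    # Scientific metadata: what the data describes and how it was obtained
    title: str
    method: str
    variables: List[str]
    # Administrative metadata: who is responsible and under which terms
    creator: str
    project: str
    license: str
    # IT (technical) metadata: how the data is physically stored
    file_format: str
    size_bytes: int

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are still empty."""
        return [name for name, value in asdict(self).items()
                if value in ("", [], None)]


# Example values are invented for illustration.
record = DatasetMetadata(
    title="Soil moisture time series",
    method="Sensor network, 10-minute sampling",
    variables=["timestamp", "soil_moisture"],
    creator="Jane Doe",
    project="Example project",
    license="",  # deliberately left empty
    file_format="CSV",
    size_bytes=1048576,
)

print(record.missing_fields())
print(json.dumps(asdict(record), indent=2))
```

A completeness check of this kind could be run before data is deposited, so that gaps in the description are noticed while the researcher still remembers the details.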

Research data should be stored in a suitable repository, ideally a subject-specific or institutional one. The data should be stored together with good documentation to enable subsequent use, especially by third parties. An important criterion is the possibility of long-term archiving, e.g. via tape storage.
It is necessary to define the conditions for data reuse, i.e. access rights, usage rights, etc., which may also include the granting of patents and licenses. Assigning persistent identifiers (PIDs) ensures that the generated data can be uniquely identified and referenced. For the widest possible dissemination, existing networks and professional communities should be involved through established and, where possible, certified infrastructure.
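
How a persistent identifier makes data referenceable can be sketched with a DOI, one common type of PID. The text above does not say which PID system is meant, so the choice of DOI is an assumption. The function name `doi_to_url` and the example identifier are hypothetical, and the pattern is only a rough plausibility check, not a full validation.

```python
import re

# Rough plausibility pattern for a DOI: "10.", a registrant code, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(identifier: str) -> str:
    """Normalise a DOI string and return its resolver URL at doi.org."""
    doi = identifier.strip()
    # Strip common prefixes so that different notations map to the same DOI.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a plausible DOI: {identifier!r}")
    return "https://doi.org/" + doi


# The identifier below is invented and does not point to a real dataset.
print(doi_to_url("doi:10.1234/example-dataset"))
```

Because the identifier rather than a storage location is cited, the reference stays valid even if the repository later moves the data.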
Good research data management enables other scientists to find and reuse the results. They do not have to recreate the data but can build on the accumulated state of knowledge. To discover data, they can use individual repositories as well as services such as re3data or DataCite. Here too, researchers must respect the legal framework and good scientific practice.