Benefits of data management

Research data management refers to how you handle, organise, and structure your research data throughout the research process. Data management:

  • Begins with your initial considerations regarding what will be necessary for using or collecting your particular type of data;
  • Includes measures for maintaining the integrity of the data, making sure that they are not lost due to technical mishaps, and that the right people can access the data at the appropriate time;
  • Looks forward to the future, making it clear that you should provide detailed and structured documentation to be able to share your data with other colleagues and prepare them for long-term availability.

To make your research as time-efficient, reproducible and safe as possible, it is important that your data management is well thought through, structured, and documented. A good data management strategy takes into account technical, organisational, structural, legal, ethical and sustainability aspects. The time invested in setting up a good data management strategy pays off when the time comes to reproduce your analysis and results. You will be able to easily find and understand your data, increase your data's reuse potential and comply with funder mandates at the same time.

Data Management Plan

"Data Management Plans (DMPs) are a key element of good data management." (European Commission, 2016)

Information regarding your data management needs to be easily found and understood, not least if you are working on a project that runs over several years and involves a large team of people. In order to simplify data management, a Data Management Plan (DMP) can be created early in the research process. A DMP is a formal document that provides a framework for how to handle the data material during and after the research project. The way a DMP will look once it is finished is not universal. It is a "living" document that changes together with the needs of a project and its participants. It is updated throughout the project to make sure that it tracks such changes over time and that it reflects the current state of your project.

DMPs vary widely because they are always built around the particular needs of the data collected within your project. Sometimes stakeholders have particular requirements that the DMP must address, for example:

Funders may require a Data Management Plan (sometimes called Data Publication Plan (DPP)) to get information on what data you intend to collect and whether (and how) you will make those data accessible to others. In this case you provide the funding agency with whatever information they require, to the extent that they specify. Depending on the nature of the call, such plans may include not only details on the kind and volume of data to be produced but also how the datasets will be documented and shared (along with other research outputs of the project, such as publications, program code, and educational resources). They may specify the length of the DMP or may expect you to include it in the page count of the scientific plan.

A DMP written for a funder is not always as comprehensive as the DMP described by the list of questions in this guide (CESSDA, 2018a). However, the list can be used as a support when writing the DMP/DPP that the funder(s) require(s). An editable version of the list is also available (CESSDA, 2018b). Note that some funders might require an updated DMP/DPP to be submitted as a deliverable within a specific time period. See 'Diversity in funder requirements' for more information.

Your institution may have its own policy regarding data management, including what information should be gathered and archived together with research data and publications. Your institution may also be able to support you in writing a DMP, e.g. by providing expertise or (referring you to) safe storage services.

The added value of a Data Management Plan

"Several researchers who I have been talking to and have looked at the Data Management Planning checklist of the Swedish National Data Service (SND) have said that doing so made them start thinking of data security, data ownership, file formats etc. before the start of their project. By doing so they avoided some possible problems that would otherwise appear later on." (Ulf Jakobson, Data manager humanities, SND)

A Data Management Plan (DMP) offers added value in the following ways:


Taking the time to plan ahead can save you a great deal of headache once the project is up and running.

Overall, a DMP helps you plan for the resources, tools, and expertise required to store, handle, and manage the types and volumes of data you expect to collect. A DMP serves as a tool for paying careful attention to all aspects of data management. It makes you aware of possible problems at an early stage so that you can work around them. For example, it reminds you to obtain consent from research participants for future reuse and sharing.

By thinking early about various aspects of data management, you can ensure that the material is well-managed already during the data collection period. Structured and well-documented data enable others to understand the materials more easily. This, in turn, facilitates the preparation of the material for archiving, and enables further research after the end of the project.

An important function of a DMP is to work as a one-stop shop for project-related information. Research becomes much easier if the answers to all your data management questions are gathered in one place and project-related details are readily available rather than just vaguely remembered or simply forgotten.

A DMP is an efficient way for the researcher and their team to gain control over research data collection and management when the research project is up and running. Regardless of the size of the team, there will be a need for easily found data-related information regarding file locations, naming conventions, standards, project description, project roles, backup regimes, versioning and so on. By writing a DMP, the researcher can ensure that the material is well-managed during the research period, which also facilitates the preparation of the material for archiving, and thus enables further research after the research project has ended. Also, it is usually easier to document research material close in time to the steps in the research process that create or change the material.

Project management becomes easier if you also include administrative information such as the names and ORCIDs of the Principal Investigator(s) and project members, information on which institution owns the data, and registration numbers for funding and ethics board approvals. Furthermore, a lot of relevant information is kept in log books, code lists, technical reports and other documents. These documents can be referred to in the DMP together with their location information. Keeping all relevant information regarding your project in one place makes future reference a lot easier, whether that future reference is for your own thesis in three years, for an audit in five years or a historical study in fifty years.
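As an illustration only, the administrative information listed above could be kept as a small structured record alongside the DMP. The sketch below is a minimal example in Python, not a prescribed format: the class name `DMPRecord`, its field names, and all values (including the placeholder ORCID) are hypothetical and chosen purely for demonstration.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DMPRecord:
    """Hypothetical project-level record; field names are illustrative only."""
    project_title: str = ""
    # Maps each investigator's name to an ORCID (placeholder values here).
    principal_investigators: dict[str, str] = field(default_factory=dict)
    data_owner: str = ""            # institution that owns the data
    funding_registration: str = ""  # registration number for funding
    ethics_approval: str = ""       # registration number for ethics approval
    # Maps a supporting document (log book, code list, ...) to its location.
    related_documents: dict[str, str] = field(default_factory=dict)


def missing_fields(record: DMPRecord) -> list[str]:
    """Return the names of all fields that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Invented example values, for demonstration only.
record = DMPRecord(
    project_title="Example survey project",
    principal_investigators={"A. Researcher": "0000-0000-0000-0000"},
    data_owner="Example University",
    related_documents={"code list": "docs/codelist.csv"},
)

# Lists the administrative details that still need to be filled in.
print(missing_fields(record))
```

Listing the empty fields is one simple way to see which details still have to be collected while the DMP is being updated during the project.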

Data management is not free. You do not want to find yourself running out of funding before the end of the project because you have ignored or underestimated the cost of structured, detailed, and safe data management. Therefore, an important aspect of a DMP is its use in calculating how much money will be required for managing your research data during your research project.

A DMP can be useful in the process of applying for funding. Grant budgets should include time and resources not only for collecting, analysing, and publishing data, but also for careful documentation, server space, backup solutions, and documentation software. A DMP is also useful once funding is granted to plan and manage your expenses. Many research funders require a DMP as part of the application and decision-making process. There are several arguments for making data available; the most common is that data produced with public funds should be used to the greatest extent possible and be available to the public. Unless there are legal, ethical or commercial barriers, data should also be openly available so that research results can be verified, replicated and reused.

Examples of Data Management cost assessments are given by the University of Utrecht (n.d.) and the Dutch Landelijk Coördinatiepunt Research Data Management (LCRDM, 2020) inspired by the 'Data management costing tool' by UK Data Service, 2013.
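To give a rough idea of how such a cost assessment can be structured, the sketch below adds up staff time and fixed costs per data management activity. It is a simplified illustration and does not reproduce the method of any of the tools cited above; the activities, hours, rates, and amounts are invented, and the names `CostItem` and `estimate_total` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CostItem:
    """One data management activity with its estimated cost components."""
    activity: str
    hours: float = 0.0        # staff time spent on the activity
    hourly_rate: float = 0.0  # cost of one staff hour
    fixed_cost: float = 0.0   # e.g. storage, backup, or software licences

    def total(self) -> float:
        return self.hours * self.hourly_rate + self.fixed_cost


def estimate_total(items: list[CostItem]) -> float:
    """Sum the estimated cost of all activities."""
    return sum(item.total() for item in items)


# Invented figures, for demonstration only.
items = [
    CostItem("Documentation", hours=40, hourly_rate=50),
    CostItem("Server space and backup", fixed_cost=1200),
    CostItem("Preparing data for archiving", hours=20, hourly_rate=50),
]

print(estimate_total(items))  # 4200.0
```

Breaking the estimate down per activity makes it easier to carry individual lines over into a grant budget and to revisit them when the DMP is updated.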

A DMP allows you to think through beforehand how to provide a data repository with a dataset that is as FAIR (Findable, Accessible, Interoperable, Reusable) as possible. A DMP:

  • Makes structuring and documenting of your datasets simpler, thus making it easier for others as well as your future self to find and understand the material;
  • Encourages you to think about the data format which is best suited for reuse;
  • Allows you to think about the reuse license you would want to apply to your data;
  • Etc.

If you draw up a DMP, you show your affiliated institution, funders and project partners that you take research data management seriously and that you handle research funds and research participants responsibly.