Data publication is expected to help ensure that data are treated as a first-class research output (Knowledge Exchange, 2013).
For a dataset to “count” as a publication, it should follow a similar publication process to an article (Brase et al., 2009) and should be:
Properly documented with metadata;
Reviewed for quality;
Searchable and discoverable in catalogues (or databases);
Citable in publications.
The authors of a Knowledge Exchange report (Knowledge Exchange, 2013) define this type of data publication as 'Publishing with a capital P' and contrast it with 'publishing with a small p', in which researchers simply post their data files on a website somewhere. Publishing with a small 'p' offers no guarantee that the data will still be there after some time or that the files will not become corrupted.
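The risk of silent file corruption mentioned above can be illustrated with a fixity check: record a checksum when the files are deposited and compare it again later. The following is a minimal sketch, not a description of how any particular repository works. The choice of SHA-256 and the function names `sha256_of` and `is_unchanged` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large data files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_digest: str) -> bool:
    """Compare the file's current digest with the one recorded at deposit time."""
    return sha256_of(path) == recorded_digest
```

A depositor could store the digest next to the dataset's documentation and rerun `is_unchanged` at a later date. A mismatch indicates that the file has changed since deposit, although it does not say how or why.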
There are different ways to publish your data. Your preference may depend on the existing practices in your discipline or on the expectations of your funder.
According to a survey by Wiley (2014), the preferred way of publishing data is as supplementary material to a journal article. That may change as more data repositories become available and more scientific journals recommend depositing in them. A data repository is a digital archive that collects, preserves and displays datasets, related documentation, and metadata (OpenAIRE, 2017).
In the comparison table below we show five ways of publishing your data, together with their advantages and disadvantages.
Advantages:
Offers specialist domain knowledge and data management expertise, e.g. to create a catalogue record and documentation;
More likely to accept complete datasets;
Provides preservation and curation to community standards, e.g. file format migration;
Ability to control access to (sensitive) personal data;
May handle data re-use queries;
May make your data visible via dissemination and promotion.
Disadvantages:
Most likely to be selective about what kind of data they accept;
May charge for data publishing;
Requires advance planning of the effort needed to meet high standards for metadata and documentation.
Choosing a data repository
There are hundreds of repositories worldwide. Some cater to a specific research domain, while others are general-purpose repositories. They may be called something other than a repository, for example, a data centre or an archive (Whyte, 2015).
If you decide to publish your data in a data repository, which one should you choose? Sometimes the repository is already determined by your funder or another external party. But if the choice is yours to make, you may consider following the order of preference in the recommendations by OpenAIRE (2016b):
Timing is everything!
In data archiving and publishing, timing is everything. If you archive or publish your data as soon as data collection ends, your knowledge of the data is still fresh. Preparing the data for deposit will then take you the least time and will give future users the highest possible data quality.
Publish a data paper
For high-quality datasets, consider publishing a data paper in a data journal. This way, you can describe your datasets in more detail, which will increase their visibility and chances of being re-used. The data journal does not hold the datasets (they are in a data repository). See 'Promoting your data' for more information on this route.
Choose between self-archiving and expert help
There is a difference between self-archiving without any help and archiving with the help of an expert. While self-archiving is a quick and easy way to publish data, archiving with the help of an expert will enhance data quality. Expert help is most likely to be available at a trusted domain repository or an institutional repository. Check whether that is the case.