Licensing your data
If you publish your data in a data repository of your choice, a licence agreement will be applied to your data. A licence agreement is a legal arrangement between the creator/depositor of the data set and the data repository, signifying what a user is allowed to do with the data. Stating clear re-use rights is like having a warm 'Welcome' on the doormat of your dataset. It is an important aspect of making sure your data meet the R (Reusable) in FAIR data management.
To make re-use as likely as possible we advise you to choose a licence which:
- Makes data available to the widest audience possible;
- Makes the widest range of uses possible.
About Creative Commons licences
The main advantages of using Creative Commons (2017) licences for the licensing of data, datasets, and databases (Korn and Oppenheim, 2011) are:
- The ease of use of the licences;
- The widespread adoption of the licences;
- Their flexibility;
- Their availability in human-readable and machine-readable forms allowing both researchers and computers to immediately know what they are allowed to do with your data;
- The increased chance that your data are reused.
There are seven licences. Their details are given in the table below (inspired by Foter, 2015), where Y means yes and N means no:
Licence | Can I copy & redistribute the work? | Is it required to attribute the author? | Can I use the work commercially? | Am I allowed to adapt the work? | Can I change the licence when redistributing?
CC0 | Y | N | Y | Y | Y
CC BY | Y | Y | Y | Y | Y
CC BY-SA | Y | Y | Y | Y | N
CC BY-ND | Y | Y | Y | N | Y
CC BY-NC | Y | Y | N | Y | Y
CC BY-NC-SA | Y | Y | N | Y | N
CC BY-NC-ND | Y | Y | N | N | Y
Do note that a CC licence cannot be revoked once it has been issued.
The licence you are allowed to apply may be determined or limited by the data repository of your choice.
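The licence table above can also be written down as a small lookup structure, which makes it easy to shortlist licences against a set of requirements. The following Python sketch is only an illustration: the field names and the `matching_licences` helper are hypothetical, and the Y/N values are copied from the table rather than from the legal texts of the licences.

```python
from typing import List, NamedTuple


class Licence(NamedTuple):
    """One row of the licence table; True corresponds to Y, False to N."""
    name: str
    redistribute: bool          # Can I copy & redistribute the work?
    attribution_required: bool  # Is it required to attribute the author?
    commercial_use: bool        # Can I use the work commercially?
    adaptation: bool            # Am I allowed to adapt the work?
    relicensing: bool           # Can I change the licence when redistributing?


# Values transcribed from the table above (hypothetical encoding).
LICENCES = [
    Licence("CC0", True, False, True, True, True),
    Licence("CC BY", True, True, True, True, True),
    Licence("CC BY-SA", True, True, True, True, False),
    Licence("CC BY-ND", True, True, True, False, True),
    Licence("CC BY-NC", True, True, False, True, True),
    Licence("CC BY-NC-SA", True, True, False, True, False),
    Licence("CC BY-NC-ND", True, True, False, False, True),
]


def matching_licences(**requirements: bool) -> List[str]:
    """Return the names of licences whose columns equal every given requirement."""
    return [
        licence.name
        for licence in LICENCES
        if all(getattr(licence, field) == wanted
               for field, wanted in requirements.items())
    ]


# Example: licences that allow both commercial use and adaptation.
print(matching_licences(commercial_use=True, adaptation=True))
```

A shortlist produced this way is a starting point only; the repository's own licence options and the full licence texts remain decisive.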
Considerations in choosing a licence
If you only consider your own benefit, you might choose a licence for which attribution is required. What you may not realise is that when such data are blended with similarly licensed data, the accumulated attribution requirements can become impractical to meet whenever the data are reused, a problem known as attribution stacking (Dodds, 2014). To facilitate the release of datasets and databases into the public domain, Creative Commons created the CC0 licence.
CC0 is the only truly open Creative Commons licence. The copyright owner waives all its rights, including the database right and the right to be identified as the creator.
Although CC0 can be used to prevent attribution stacking, attribution can be important as a means of recognising both the source and the authority of the data. To acknowledge this, data released under CC0 can be accompanied by non-binding suggestions for best practices in attribution.
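To make the attribution burden concrete, the sketch below counts how many creators must be credited when several datasets are combined. It is a simplified illustration under stated assumptions: the dataset creators are invented, the `required_credits` helper is hypothetical, and the only rule it applies is the one from the table (every licence except CC0 requires attribution).

```python
from typing import List, Tuple

# Assumption taken from the licence table: only CC0 does not require attribution.
ATTRIBUTION_FREE = {"CC0"}


def required_credits(sources: List[Tuple[str, str]]) -> List[str]:
    """Return each creator that must be credited, once, in order of first appearance.

    `sources` is a list of (creator, licence) pairs for the combined datasets.
    """
    credits: List[str] = []
    for creator, licence in sources:
        if licence not in ATTRIBUTION_FREE and creator not in credits:
            credits.append(creator)
    return credits


# Hypothetical combined dataset built from four sources.
combined = [
    ("Lab A", "CC BY"),
    ("Lab B", "CC0"),
    ("Lab C", "CC BY-SA"),
    ("Lab A", "CC BY"),
]
print(required_credits(combined))
```

With a handful of sources the list stays manageable, but it grows with every attribution-requiring dataset that is merged in, while CC0 sources never add to it.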
There will be circumstances in which CC0 is inappropriate because of specific risks that might arise for the licensor, and perhaps subsequently also for any users. For example, when:
- Datasets containing (sensitive) personal information are deposited without the necessary consent having been obtained (see the chapter on protecting data);
- Permission of the copyright holder has not been sought;
- The rights holders are unknown or cannot be traced (orphan works).
In these cases, licences that place ‘some’ restrictions upon the user, such as those with an “ND” (No Derivatives) and/or “NC” (Non-Commercial) element, might be more appropriate.
Tips for choosing a licence
1. Be sure who owns the data
Remember that you can only archive and publish data that you own or have permission to publish.
2. Use the licence selector
Choose an appropriate licence for your datasets with this licence selector (n.d.).