Processing personal data

Since 25 May 2018, the General Data Protection Regulation (GDPR, European Union, 2016a) applies to any researcher in the EU or the wider European Economic Area (EEA) who collects personal data, and to any researcher worldwide who collects personal data on people in the EU. The GDPR applies only to the data of living persons. Data which do not count as personal data do not fall under data protection legislation, though there may still be ethical reasons for protecting this information.
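The GDPR (General Data Protection Regulation, Chapter 2, Article 5) prescribes that you should adhere to the following six principles when processing personal data:

  • I. Lawfulness, fairness and transparency
  • II. Purpose limitation
  • III. Data minimisation
  • IV. Accuracy
  • V. Storage limitation
  • VI. Integrity and confidentiality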

The research exemption

The GDPR contains an exemption which entails that some of the principles above are slightly different when you collect and process personal data for research purposes. This is called the 'research exemption'.

“Processing for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes, shall be subject to appropriate safeguards, in accordance with this Regulation, for the rights and freedoms of the data subject. Those safeguards shall ensure that technical and organisational measures are in place in particular in order to ensure respect for the principle of data minimisation. Those measures may include pseudonymisation provided that those purposes can be fulfilled in that manner. Where those purposes can be fulfilled by further processing which does not permit or no longer permits the identification of data subjects, those purposes shall be fulfilled in that manner.” (General Data Protection Regulation, Article 89)


In practice, this means that Principles II (purpose limitation) and V (storage limitation) are less strict. Further processing of personal data for archiving, scientific or historical research, or statistical purposes is not considered incompatible with the initial purposes of data collection, even when the new purpose was not expressly mentioned earlier.

Also, personal data may be stored for longer periods for such purposes. In all cases, appropriate technical and organisational measures should be taken to safeguard the rights and freedoms of the participants in your research, such as data minimisation and pseudonymisation.

Personal data can only be processed when there is a valid legal basis to do so. The GDPR recognises six bases (grounds):

  • consent of the data subject
  • necessary for the performance of a contract
  • legal obligation placed upon the data controller
  • necessary to protect the vital interests of the data subject
  • carried out in the public interest or in the exercise of official authority (public task)
  • legitimate interest pursued by the data controller

In research, the three most applicable bases for processing personal data are consent, public interest (public task) and legitimate interest. For each research project in which personal data will be collected and processed, the most appropriate legal basis needs to be decided and recorded, and it should not be changed at a later date. The UK Data Service has published examples of where a legal basis may be applied in research.

When you start a research project that involves collecting information from people, for example via a survey or interviews, consider whether or not you will collect personal data. If not, then data protection legislation does not apply. If you will collect personal data, then:

  • determine who will be the data controller (possibly your institution)
  • decide which legal basis will apply
  • if collaborative partners need access to personal data, then make sure agreements are in place
  • consider whether a Data Protection Impact Assessment is needed (see details on this in the GDPR Questions and Answers below)
  • communicate to research participants how personal data collected about them will be used, stored, processed, transferred, who the data controller is (with their contact details), the legal ground and purpose of the processing, the period of retention and their rights; this can be done via an information sheet or a webpage (e.g. privacy notice)
  • consider where to store personal data securely
  • minimise the personal data to collect and pseudonymise where possible
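
The last point in the checklist above (minimise and pseudonymise) can be illustrated with a minimal Python sketch. It is only one possible approach, not a prescribed method: the function name `pseudonymise`, the field names and the `P-` prefix are hypothetical choices for illustration. Each direct identifier is replaced by a random pseudonym, only the fields needed for analysis are kept, and the table linking pseudonyms back to identifiers is returned separately so that it can be stored apart from the research data.

```python
import secrets


def pseudonymise(records, id_field, keep_fields):
    """Return (pseudonymised_records, key_table).

    records     -- list of dicts, one per participant response
    id_field    -- name of the field holding the direct identifier
    keep_fields -- the only fields copied to the output (data minimisation)
    """
    key_table = {}   # pseudonym -> original identifier; store separately and securely
    pseudonyms = {}  # original identifier -> pseudonym, so repeat participants match
    output = []
    for record in records:
        original_id = record[id_field]
        if original_id not in pseudonyms:
            # Random token: the pseudonym cannot be derived from the identifier.
            pseudonym = "P-" + secrets.token_hex(8)
            pseudonyms[original_id] = pseudonym
            key_table[pseudonym] = original_id
        # Copy only the fields needed for analysis; everything else is dropped.
        minimal = {field: record[field] for field in keep_fields}
        minimal["pseudonym"] = pseudonyms[original_id]
        output.append(minimal)
    return output, key_table


# Hypothetical survey responses, for illustration only.
responses = [
    {"email": "a@example.org", "name": "A", "age_group": "30-39", "answer": "yes"},
    {"email": "b@example.org", "name": "B", "age_group": "40-49", "answer": "no"},
    {"email": "a@example.org", "name": "A", "age_group": "30-39", "answer": "no"},
]
data, key_table = pseudonymise(responses, "email", ["age_group", "answer"])
```

Pseudonymised data remain personal data for anyone who can access the key table. Note also that the retained fields may still identify someone in combination, so removing direct identifiers alone does not make a dataset anonymous.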

Here are some questions and answers about how to implement the GDPR requirements in practice in a research project, resulting from a 2019 CESSDA webinar.

Q: I am a postdoc researcher doing a qualitative study, interviewing women about abusive relationships. I will use pseudonyms for each woman interviewed. Respondents may still be identifiable from the story they tell. Does this constitute personal information? If so, which legal ground should I use for this research?

A: Yes, this would constitute personal information. In this case the legal ground could be consent, which should be sought from the women participating in the study. Another aspect to keep in mind here is data collected which would allow identification of other people who may not have been asked for consent, for example partners carrying out the abuse. So you may also be processing personal data from people who have not been asked for consent. In that case, the processing ground could be public interest and the argument would be that the research has value for society. If the project allows, such partners could be made aware of the processing of their data, if this poses no risks to the participating women.

Q: I am doing an online poll survey, using Qualtrics, asking 5000 people across Europe for which political party they voted in the recent European elections, also recording their ethnicity and other demographic information. Does this qualify as processing special categories data? If so, how do I gain explicit consent for collecting this information?

A: A first consideration would be how much identifying/personal information is collected during the survey, alongside the political view and ethnicity. This helps to decide whether this classifies as special categories data. If no data is collected that allows identification of the respondents, then the GDPR will not apply. If identifying information is collected, then this qualifies as special categories data and therefore explicit consent would be needed. One way to achieve this would be through double consent, whereby consent for processing personal data collected would be asked at the beginning and the end of the questionnaire.

Qualtrics is a US-based company and, thanks to negotiations by various European survey institutions, Qualtrics now only processes collected survey data in the EU for EU-based surveys. This means that Qualtrics can be used as a tool for surveys that need to comply with the GDPR.

Q: What are the GDPR rules when using administrative or register data that contain personal information?

A: If consent is not collected from the individuals when the administrative or register data are collected, then the most common legal basis for further use is public task. If consent can be sought, that would be preferable.

Q: The GDPR indicates strongly that a consent form should be easy and clear, yet I have to provide so much extra information to my interviewees now. How do I do this?

A: The best way to provide this information to participants is through an information leaflet and a consent form. You can provide the information in a written leaflet. If you are interviewing people, you can also explain the leaflet content face-to-face to make sure that people understand it.

Q: If a researcher brings an electronic device across the border to a third country, sends an email or publishes personal data on the web, does this constitute a data transfer?

A: An email containing personal data sent from Europe to someone in a non-European country would indeed constitute a data transfer. An electronic device containing personal data carried across the border to a third country would constitute a data transfer if the personal data will be passed on to another person. If personal data are published on the web, it depends on whether the data are stored and who can access them. If it is openly published it could be considered a data transfer.

Q: What are the data protection implications for international partnerships and research projects when non-EU countries are involved?

A: If personal data are going to be handled/processed as part of the partnership research activities within the EU, then the GDPR would apply. One solution would be that the European-based partners require their non-EU partners to have appropriate privacy/data protection measures in place and that consent is given by all subjects, irrespective of whether they are based in Europe or not. That may not always be easy or possible. However, solutions can be found, such as data anonymisation, data encryption and the use of secure servers, and partners can learn from each other. It is also good practice to record all users of the personal data and the purposes for which they are used.

Q: Does the GDPR apply to personal data, collected outside the European Economic Area (EEA) and transferred to the EEA for analysis?

A: Yes, it would, because it would be classified as personal data once stored within the EEA.

Q: Are there examples of research where using consent as legal basis for processing personal data would not be suitable?

A: Covert research is an example where consent would not be an appropriate processing ground, as asking for consent would have a negative outcome for the research. In covert research, public task would likely be the best ground. It is still important that the research adheres to ethical principles, and the researcher is open about the process used in publications.

Q: How can we comply with the GDPR when studying populations that are easily identified, for example surveys of candidates running in a general election or surveys of the members of a scientific association?

A: First, you need a legal basis for the processing of personal data. The most common legal basis for this scenario may be consent. If you seek consent from the people studied, you can inform them about the risk of being identified in published outcomes of the survey and ask for consent on that basis. If the legal basis for processing personal data is public task, you should give information about the study to the population to make sure that they can exercise their rights under the GDPR.

Q: How is the ‘right to be forgotten’ applied in research settings?

A: The right to be forgotten applies in research, but is not an absolute right. Best practice is to inform participants about this right as clearly as possible and explain what it means and what it may not mean. For example, if data have been published in which people are identifiable, such as a paper containing a quote for which permission was given, it would be very difficult to retract the paper if a participant later wants to be forgotten. So be clear to participants about what they can do with this right and up to which point they can withdraw from research and request to be forgotten.


Q: Is a Data Protection Impact Assessment (DPIA) only required in scientific research for sensitive data concerning vulnerable subjects?

A: A DPIA is required for data processing that is likely to result in a high risk to the rights and freedoms of individuals. In practice, this means a DPIA is needed if at least two of the following criteria apply (examples can be found in the Article 29 Data Protection Working Party guidelines, WP 248):

  • evaluation or scoring
  • automated decision-making with legal or similar significant effect
  • systematic monitoring
  • sensitive data
  • data processed on a large scale
  • datasets that have been matched or combined
  • data concerning vulnerable data subjects
  • innovative use or applying technological or organisational solutions
  • data transfer across borders outside the European Union
  • when the processing in itself prevents data subjects from exercising a right or using a service or a contract.
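
The "at least two criteria" rule of thumb above can be written down as a simple screening helper. The following Python sketch is only an illustrative aid under stated assumptions: the function name `dpia_likely_required` and the short criterion labels are hypothetical shorthand for the list above, and the result is a prompt to consult your data protection officer, not a legal determination.

```python
# Shorthand labels for the ten criteria listed above (hypothetical wording).
DPIA_CRITERIA = frozenset({
    "evaluation or scoring",
    "automated decision-making",
    "systematic monitoring",
    "sensitive data",
    "large scale",
    "matched or combined datasets",
    "vulnerable data subjects",
    "innovative use",
    "transfer outside the EU",
    "prevents exercising a right or using a service",
})


def dpia_likely_required(criteria_met, threshold=2):
    """Return True if at least `threshold` of the known criteria apply."""
    met = set(criteria_met)
    unknown = met - DPIA_CRITERIA
    if unknown:
        # Guard against typos silently lowering the count.
        raise ValueError(f"Unknown criteria: {sorted(unknown)}")
    return len(met) >= threshold


# Example: an interview study with vulnerable participants and sensitive data.
interview_study = {"sensitive data", "vulnerable data subjects"}
needs_dpia = dpia_likely_required(interview_study)
```

Even when fewer than two criteria apply, carrying out a DPIA can still be worthwhile for the reasons given below.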

At the same time, a DPIA is a good learning tool. For a research project that involves the collection of personal data, a joint session of the researcher with a legal person and a technical person is very useful to establish best practices for data protection. This helps to understand context and helps to define common problems, solutions and risk mitigation measures.


Q: How are Data Protection Impact Assessments (DPIAs) being implemented across different institutions, for research?

A: If research is done as a collaboration of more than one institution, with shared responsibilities, one DPIA done by one of the institutions should be enough, and the other partner institutions should apply that same DPIA. Problems might arise when research involves institutions that are implementing a DPIA in different countries, whereby policies or requirements may vary across those countries, such as for data security, ownership of the data, different understandings on gaining consent and which legal basis to use for processing personal data.

Q: How should researchers deal properly with the GDPR in the context of open data?

A: For personal data, the open access motto “as open as possible, as closed as necessary” is important. A political or societal drive for open access and open science does not mean that individual rights granted by legislation can be overruled. Therefore, for personal data, ‘as closed as necessary’ is the key.


Q: What is the applicability of ‘legitimate interests’ in research using Artificial Intelligence (AI)?

A: The use of AI is a specific form of using personal data, and legitimate interest could be a legal basis for AI. More important is the framework provided by the guidelines and recommendations of the High-Level Expert Group on AI: Ethics guidelines for trustworthy AI and Policy and investment recommendations for trustworthy AI.


Q: When a US entity is a processor of pseudonymised EU citizen data and the key to re-identify the subjects exists only in the EU, so that the US entity cannot re-identify the subjects, does the GDPR apply to the US entity? Is the US entity required to sign a contract if requested by the EU entity?

A: If the US entity has no access to the key, then the data would in theory be classified as anonymous data. If the key were ever released or the US entity gained access to it, then the data would be defined as pseudonymised data, and therefore personal data. The organisation would need to decide whether signing a processor agreement would be best, considering the risks they wish to take.


Q: In research projects that plan to use data collected from social media platforms, how can researchers reconcile the right to privacy vs. the publicly available data?

A: Gaining consent would be the best approach when using social media data. So even for social media data in the public domain, researchers should ask the people whose social media content they mine for their consent when possible. In some cases public task could be used as legal basis.


Q: Are European countries converging or diverging in their choice of legal basis for processing personal data in research across Europe, specifically when considering whether consent or public task would be used in research?

A: The UK strongly encourages the use of public task as legal ground in research, whereas many other European countries favour consent. The UK view may pose a risk for participants’ rights. We will be able to evaluate in future how this has evolved. For the German case one can rather see a diverging trend since the Federal government left things open to be defined by the 16 Federal States. They took the chance and eight have now introduced the definition of “anonymous data” as formerly used in the German Data Protection Act. But they all see consent as a major basis for research.


Q: Should data repositories and data archives be considered as data processors or data controllers? Is archiving research data from a project part of the original processing for the research, or does it constitute a separate, further processing?

A: In most instances it is likely that data archives would be considered as data processors. However, some data archives may also be involved in undertaking research for the projects, which could lead to them being a joint controller with the research institute.

Different data archives in different countries may take a different view. Some would consider all data collections as potentially personal data and treat them as such. Only when data are considered to be fully anonymous would the GDPR no longer apply. Other archives take a two-tiered approach, with certain procedures for anonymous data and others for personal data. An archive can archive personal data if there is a legal basis to do so. Liaising with the research project team is important.