Handling of Personal Data
Disclaimer: No legally binding information! Please always contact the legal department of your institution for specific legal questions and for legally binding information and a data protection officer of your institution for questions regarding data protection.
What is personal data?
According to the General Data Protection Regulation (GDPR), personal data “means any information relating to an identified or identifiable natural person [...]; an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person” (Art. 4 No. 1 GDPR).
If, for example, 20 natural persons are asked about their last vacation destination, this data is not personal data for the time being. However, if the name is also recorded, the vacation country also counts as personal data, since it can now be directly assigned to a specific natural person. In the case of indirect identification, the same rules apply. Indirect identifiers can also be used to uniquely identify individuals, e.g., in the case of a combination of occupation and company, insofar as this occupation is unique to the company (e.g., in the case of jobs with managerial functions). Here is a list of examples of indirect identifiers: first names, place names, street names, states, institutional/organizational affiliations (e.g., employer, school), occupational information, titles and educational degrees, age, time/calendar data, pictures, and voices.
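The point about indirect identifiers can be illustrated with a small check. The following is a minimal sketch, not a legally sufficient test: it assumes survey records are held as simple dictionaries, and the field names (occupation, company, destination) and the helper unique_combinations are hypothetical. It flags combinations of indirect identifiers that occur only once in the data set and could therefore single out one person.

```python
from collections import Counter


def unique_combinations(records, quasi_identifiers):
    """Return the combinations of indirect identifiers that occur exactly once."""
    combos = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return [combo for combo, count in combos.items() if count == 1]


# Hypothetical survey rows: no names are recorded, but occupation and company are.
records = [
    {"occupation": "clerk", "company": "A", "destination": "Italy"},
    {"occupation": "clerk", "company": "A", "destination": "Spain"},
    {"occupation": "managing director", "company": "A", "destination": "Norway"},
]

risky = unique_combinations(records, ["occupation", "company"])
print(risky)  # [('managing director', 'A')]
```

In this made-up example the two clerks cannot be told apart by occupation and company, whereas the managing director is unique within the company, so that person's vacation destination would count as personal data.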
“Processing of personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health or data concerning a natural person’s sex life or sexual orientation” (Art. 9 No. 1 GDPR) is considered processing of special categories of personal data and requires special precautions which may have an impact on the possibilities of publishing such data. If you are unsure, always involve a JLU data protection officer and, if necessary, the Ethics Committee (Faculty 06 / Faculty 11).
Data are personal if they can be clearly assigned, directly or indirectly, to a specific natural person. Depending on the type of personal data, however, they may require different levels of protection, which may affect the possibilities for publishing them.
What do I have to consider with personal data?
Before the survey
First consider whether personal data are absolutely necessary for your research project or whether you could carry out the project without them. If you plan to aggregate your data for later presentation because you want to perform a statistical analysis, decide before collection which data can be directly assigned to a natural person, so that you can store them separately and easily delete them after aggregation. If the research purpose requires the collection of personal data, special regulations must be followed when handling them, i.e., when collecting, processing, evaluating, analyzing, publishing, and archiving them. Special categories of personal data (see Art. 9 GDPR) must be handled with particular care. Under certain circumstances, these regulations may also stand in the way of publishing the data. In principle, processing requires a legal basis in order to be lawful (see Art. 6 GDPR).
The GDPR is decisive for the handling of personal data within the EU; at the federal level in Germany, this is additionally regulated by the Federal Data Protection Act (BDSG). General information on which data protection principles should be taken into account in research design and project planning can be found here (German). For further information related to the social and economic sciences, see also the Data Protection Guide by RatSWD.
If personal data is included in your research project, there are certain regulations that must be followed when it is collected. This ensures that you are legally allowed to evaluate the data you have collected.
If the research does not serve a public interest worthy of protection that outweighs the interests of the individual in accordance with the Federal Data Protection Act (§ 27 BDSG), informed consent must be obtained from the data subject prior to collection. As part of this consent, the data subject must be informed fairly, transparently and comprehensively about the collection and processing of his or her data (information to be provided according to Art. 13 GDPR). This includes, among other things, the name and contact details of the person responsible, the purpose of the collection and the legal basis for the data processing, but also information about the storage location, who has access to the data and, if applicable, where the data will be published. In addition, the data subject must be informed about the storage period of the data, his or her right of rectification and the lack of consequences of refusing or revoking consent, as well as his or her right to information about his or her personal data. An informed declaration of consent has the advantage that the purpose of the survey can be shown precisely to the respondent. In the case of a strict purpose limitation, however, it has the disadvantage that the data may only be used for exactly this research question and only within the scope of this survey. Thus, the data cannot be reused by others, even if they could use them to answer other research questions. In addition, any planned publication should be explicitly stated in the consent form. Further information on informed consent can be found, for example, at VerbundFDB (German). Sample informed consent declarations in German and English can be found at Qualiservice in the download area under “Template forms”.
If the specific purpose of the research has not yet been precisely defined, or if you know in advance that the data may also be relevant to other research questions, you can also create a broad consent. Here you have two possible approaches: Either you already list in this consent your possible further settings for which the data will be used or you try to obtain a very general consent, for example, for your discipline or for all further research questions. Keep in mind, however, that the latter will probably only very rarely be signed by the respondent, despite the right of withdrawal. Furthermore, it is still legally unclear how a broad consent must be written in a legally secure way. In the report on the legal framework for research data management, which was produced as part of the DataJus project in 2018, it is recommended, for example, that a “broad consent” should be as specific as possible, i.e., that possible further research questions and follow-up projects should be listed directly and explicitly.
If the personal data are special categories according to Art. 9 GDPR, they must be listed separately and explicitly in the declaration of consent. This includes, for example, ethnic and biometric data as well as information on political opinions.
The data subject also has the right to request information about his or her personal data at any time (right of access according to Art. 15 GDPR), to request deletion of the data (Art. 17 GDPR), and to object to processing (Art. 21 GDPR). As part of the right of access, the data subject must in particular be given the name of the contact person, the recipients of the data, the storage period, the processing purpose, and the categories of personal data. Should a data subject withdraw his or her consent, all evaluations made up to the time of the withdrawal may continue to be used, but no further evaluations may be made with that person's data.
For the researcher, handling personal data also entails the obligation to collect as little data as possible (the principle of data minimization), to store the data for as short a time as possible, and to protect the data as well as possible against loss and misuse (according to Art. 5 GDPR). Because data protection issues around personal data are highly complex, researchers should always seek legal advice in advance.
If you process personal data, the GDPR requires you to keep a Record of Processing Activities (see Art. 30 GDPR and Recital 82 GDPR). Further information on this topic is provided here (German) by the Hessian Commissioner for Data Protection and Freedom of Information. If an external provider is commissioned to conduct, for example, a survey or an interview, a data processing agreement must be drawn up in accordance with the GDPR. A sample contract can also be found here (German) at the Hessian Commissioner for Data Protection and Freedom of Information.
“The result of processing for statistical purposes is not personal data, but aggregate data” (Recital 162 GDPR). These cannot be traced back to individual natural persons, even if personal data of any protection need were included in the original survey.
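As an illustration of what aggregation means in practice, here is a minimal sketch that reuses the vacation-destination example from earlier in this text. The record layout and the function name aggregate_destinations are assumptions made for this example only. The sketch only counts answers per destination and does not check whether individual groups are so small that a person could still be recognized.

```python
from collections import Counter


def aggregate_destinations(responses):
    """Count answers per destination; names are not carried into the result."""
    return dict(Counter(response["destination"] for response in responses))


# Hypothetical person-level survey responses.
responses = [
    {"name": "Person A", "destination": "Italy"},
    {"name": "Person B", "destination": "Spain"},
    {"name": "Person C", "destination": "Italy"},
]

summary = aggregate_destinations(responses)
print(summary)  # {'Italy': 2, 'Spain': 1}
```

The summary contains only counts, so the person-level rows, including the names, can be stored separately and deleted once they are no longer needed.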
Regardless of whether aggregation can be performed or is planned, pseudonymization must be performed as early as possible, e.g., as early as the creation of transcripts. If, for example, each respondent is assigned a kind of "identity number," his or her information can be linked to that number rather than to his or her clear name. Pseudonymization thus means that the personal data can no longer be assigned to a specific data subject without additional information, such as a key list that maps clear names to identification numbers. Consequently, this additional information must be stored separately and must be subject to technical and organizational measures so that unauthorized persons cannot perform the assignment (in accordance with Art. 4 No. 5 GDPR). Pseudonymization thus protects the assignment through data separation, but preserves it.
If you want to archive your personal data and/or make them available to the public for subsequent use via a repository in the form of a publication, it is important to comply with data protection regulations and to protect the personal rights of the persons subject to the study. The openness of access to the data depends on the degree to which the data require protection.
On the one hand, there is the possibility of publication via data centers, to which data can be handed over without anonymization if necessary and which instead use access restrictions depending on the need for protection. This makes sense because many data sets can lose enormous value through anonymization. Examples of such data centers are Qualiservice and GESIS for social science research data, or VerbundFDB, which handles the transfer to data centers of empirical educational research data for you. Here you will also find a list of all research data centers accredited by the RatSWD.
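The separation between pseudonymized data and the key list can be sketched as follows. This is an illustrative example under stated assumptions, not a prescribed procedure: the function pseudonymize, the field names, and the "P-" identifier format are hypothetical, and the sketch does not cover the technical and organizational measures needed to protect the key list.

```python
import secrets


def pseudonymize(records, name_field="name"):
    """Replace clear names with random identification numbers.

    Returns the pseudonymized records and the key list (clear name -> number).
    The key list must be stored separately from the pseudonymized records.
    """
    key_list = {}
    pseudonymized = []
    for record in records:
        name = record[name_field]
        if name not in key_list:
            new_id = "P-" + secrets.token_hex(4)
            # Extremely unlikely, but make sure no identifier is issued twice.
            while new_id in key_list.values():
                new_id = "P-" + secrets.token_hex(4)
            key_list[name] = new_id
        row = {k: v for k, v in record.items() if k != name_field}
        row["participant_id"] = key_list[name]
        pseudonymized.append(row)
    return pseudonymized, key_list


# Hypothetical interview metadata with clear names.
interviews = [
    {"name": "Person A", "answer": "Italy"},
    {"name": "Person B", "answer": "Spain"},
    {"name": "Person A", "answer": "Norway"},
]

data, key_list = pseudonymize(interviews)
```

Here data can be used for analysis, while key_list is the additional information that allows re-identification. Deleting the key list removes this mapping; whether the remaining data then count as anonymized still depends on the indirect identifiers they contain.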
However, if you want to publish your data Open Access in a research data repository (e.g. the institutional research data repository of JLU Gießen called JLUdata or the generic repository Zenodo), your research data must be anonymized. Anonymized data is “information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable.” It further states, “To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly. To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments” (Recital 26 GDPR).
If you want to publish personal data in JLUdata or another repository, these data must be anonymized and the use after the end of the research project must not be excluded in the consent. This is because even anonymous data may not be published if the consent explicitly excludes subsequent use after the end of the project.
Anonymization techniques distinguish between qualitative and quantitative data. VerbundFDB provides anonymization instructions for both qualitative and quantitative data, which you can find here (German). In addition, OpenAIRE offers a tool called Amnesia to help you anonymize your data. The Data Protection Foundation (Stiftung Datenschutz) published a Practice Guide to Anonymising Personal Data at the end of 2022, which you can find here.
According to Art. 5 No. 1 (e) GDPR, it is permitted for research purposes to store personal data for specific purposes even beyond the period of processing, while adhering to security standards to protect the data from misuse.
As a first step the decision tree (German) developed in the DataJus project on data protection issues arising for the publication of research data can help you with legal questions. In addition, a more detailed overview of data protection in research data management can be found in the article by Watteler/Ebel (2019) (German). Leuphana University Lüneburg (German) also offers an overview of questions about legal aspects surrounding research issues, including answers about handling personal data.
Further information on research data management and law
If you are looking for more information on how to handle legally sensitive data in the context of research data management, the free ILIAS self-study unit on research data management provides a good initial overview. At the moment, this self-study unit is only available in German; a translation is in preparation.