Sharing sensitive data: key considerations and approaches for safer sharing
12 January 2023 | Dr. Rebecca Grant

Many publishers encourage authors to share their data as openly as possible to facilitate reuse and support greater transparency of research outcomes. For researchers who have worked with human research participants, however, the request to open their datasets may raise questions.
In November 2022, we hosted a webinar on sensitive research data sharing. In this session, Rebecca Grant (Head of Data and Software Publishing, F1000) discussed some key elements researchers need to consider to share sensitive data legally and ethically.
The webinar is available to watch on demand; the session’s highlights are summarized below.
What makes research data sensitive?
Generally speaking, sensitive research data is data that should not be shared publicly without additional consideration. Examples include human data, data affecting national security, and the locations of rare or endangered species.
Types of human data
Depending on the type of research, “human data” can take many forms. In clinical research, sensitive data includes blood samples, tissue samples, or genetic sequences. In the social sciences, sensitive data can include demographic information, images, videos, audio files, or qualitative data related to attitudes, opinions, or experiences. In addition, many researchers now incorporate datasets originating from social media and other online platforms, such as Facebook profiles, tweets, or internet dating profiles, which also constitute human data.
Datasets from human research participants often include personally identifying information, which could allow others to identify research participants if the dataset were released openly. In addition, some research covers particularly sensitive topics, such as alcohol dependency or sexuality, which could put participants at increased risk if they were identifiable. Because of these concerns, it’s important that human data is handled appropriately and shared openly only once the ethical and legal implications have been considered.
Framing data sharing best practice: ethical and legal requirements
So, what do you need to consider before sharing sensitive human research data?
Firstly, it is important to be aware that legal and ethical frameworks may differ depending on where you are based and where the research is being conducted. You should identify and comply with the relevant legislation in your region, and discuss any ethical concerns with your Institutional Review Board before sharing sensitive data openly. While publishers and funders often have open data sharing policies for researchers, these policies take sensitive human data into account and will not require you to share data when it is not appropriate to do so.
Local and regional data protection legislation is usually the basis for the legal aspects of human data management, storage, and sharing. For example, personal data is covered by GDPR in the European Union, while health information is covered by HIPAA in the US. Note that the legislation which applies to your research will usually be determined by where you are based, but if your research participants are in another region then their local legislation may also apply.
Additionally, when working with human research participants, ethical frameworks for research and data sharing are equally important. You should consider the rights and dignity of your participants, and ensure that they have given informed consent and have understood how their data will be used and shared. Some research participants may prefer to be identified in a research study, or even find it empowering. You can give participants the option to be identified or to have their data anonymized before sharing, as long as they have given their informed consent.
Both legal and ethical requirements are important when it comes to sharing sensitive data, and one does not supersede the other. Researchers should take both into account when considering how their research data can be shared.
Approaches for safer sharing
Although researchers should take additional precautions when considering sharing sensitive data openly, there are a number of steps which can be taken to share data safely, ethically, and in compliance with relevant legislation.
#1: Consideration and consent
Before a study begins, even at the point of applying for grant funding, you will need to consider how any sensitive data you collect might be shared. It is important to do this in advance as you should ideally provide an overview of your data sharing plans to your Institutional Review Board as part of the ethical review process. It is also necessary to plan ahead so that when you are recruiting your research participants, you can describe the planned data sharing methodology to them and obtain their informed consent.
When approaching potential research participants to join your study, you should provide them with an explanation of what the study will involve and how their data may be shared. Consent should be written or at least recorded (e.g. as an audio file), and it must be given freely; you cannot provide a form with pre-ticked boxes, for example. You must also allow participants to opt out of data sharing or, alternatively, to opt out of the study entirely if they do not agree to their data being shared.
#2: Anonymization
Anonymization removes identifying information from a dataset. It is a method of protecting participant privacy and reducing the likelihood of re-identification. Although anonymization removes the identifiable information from a dataset, you should only apply these processes if you have received informed consent from your participants to do so.
When anonymizing a dataset, you are likely to need to remove both direct and indirect identifiers. A direct identifier, such as full name, date of birth, or address, uniquely identifies a research participant. Indirect identifiers, such as ethnicity, gender, place of birth, and job title, may not identify a participant individually but can do so in combination. Anonymization is an iterative task which will be specific to the dataset that you generated. You should continually reevaluate the dataset as you conduct the process to ensure that you have removed sufficient information without destroying the value of the data.
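As a rough illustration of why indirect identifiers matter in combination, the sketch below counts how often each combination of values occurs and flags those that appear only once. This check is not part of the webinar; the records, field names, and the `unique_combinations` helper are all hypothetical, and a real assessment would need to be adapted to your own dataset.

```python
from collections import Counter

# Hypothetical records: field names and values are invented for illustration.
records = [
    {"ethnicity": "A", "gender": "F", "birthplace": "Leeds", "job": "Nurse"},
    {"ethnicity": "A", "gender": "F", "birthplace": "Leeds", "job": "Nurse"},
    {"ethnicity": "B", "gender": "M", "birthplace": "Cork", "job": "Pilot"},
    {"ethnicity": "A", "gender": "M", "birthplace": "Leeds", "job": "Teacher"},
]


def unique_combinations(rows, identifiers):
    """Return the combinations of identifier values that occur exactly once."""
    counts = Counter(tuple(row[name] for name in identifiers) for row in rows)
    return [combo for combo, n in counts.items() if n == 1]


# Combinations shared by only one participant may point to that individual.
risky = unique_combinations(records, ["ethnicity", "gender", "birthplace", "job"])
for combo in risky:
    print("Potentially identifying combination:", combo)
```

Rerunning a check like this after each anonymization pass is one way to support the iterative reevaluation described above: gender alone is shared by several participants in this toy dataset, while the full combination singles some of them out.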
Key data anonymization techniques include:
- Removing variables that are not necessary for analysis or relevant to the research
- Generalization: making a data point less specific, such as replacing an address with a city
- Pseudonymization: referring to a research participant without using their real information, for example by replacing their name with a pseudonym
- Creating bands (banding): replacing a specific value with a range, such as an exact age with an age band
You may need to use a combination of these when anonymizing a single dataset. Similar techniques can be used for anonymizing qualitative data, and any change to the original dataset can be indicated by placing marker symbols around the altered text, for example, # or @ symbols.
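The techniques above can be combined in a few lines of code. The following is a minimal sketch rather than a recommended implementation: the field names, the `P001`-style pseudonym codes, the ten-year age bands, and the assumption that every address ends with the city are all invented for illustration.

```python
def band_age(age, width=10):
    """Banding: replace an exact age with a range such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def anonymize(rows):
    """Apply removal, generalization, pseudonymization, and banding to each row."""
    pseudonyms = {}  # real name -> code; would need secure storage or deletion
    result = []
    for row in rows:
        # Pseudonymization: the same name always maps to the same code.
        code = pseudonyms.setdefault(row["name"], f"P{len(pseudonyms) + 1:03d}")
        result.append({
            "participant": code,
            "age_band": band_age(row["age"]),
            # Generalization: keep only the city (assumes "street, city" format).
            "city": row["address"].split(",")[-1].strip(),
            "response": row["response"],
            # Removing variables: "email" is simply not copied across.
        })
    return result


def mark_change(text, original, replacement, marker="#"):
    """Qualitative data: flag an altered passage with marker symbols."""
    return text.replace(original, f"{marker}{replacement}{marker}")


rows = [
    {"name": "Jane Doe", "age": 34, "address": "1 High Street, Leeds",
     "email": "jane@example.org", "response": "agree"},
    {"name": "John Roe", "age": 40, "address": "2 Mill Lane, Cork",
     "email": "john@example.org", "response": "disagree"},
]

print(anonymize(rows))
print(mark_change("I moved to Leeds in 2010.", "Leeds", "city"))
```

Whichever techniques you apply, the output should still be reviewed by hand, since automated steps like these cannot judge whether the remaining values are identifying in context.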

#3: Controlled access
There may be cases where data cannot be fully anonymized, or where the anonymization process would remove so much information that the dataset would no longer be useful. It is still possible to make such research data accessible whilst protecting participant privacy by using a controlled access data repository, as long as you have obtained participant consent to do so.
Controlled access data repositories provide a location, usually on the web, where researchers can store their data without publishing it openly. Usually, a metadata record describing the data will be shared openly instead, and the repository will require users to meet certain conditions so that only approved users can access the data. If you are publishing a paper based on sensitive data stored in a controlled access repository, you can use your Data Availability Statement to describe where the data is located and the conditions under which it can be accessed.
At F1000, we encourage our authors to make their data as open as possible, and as closed as necessary. We also recognize that openly sharing data may not always be feasible due to ethical considerations or other restrictions. That’s why we have policies in place to support the publication of papers associated with such data, whilst maintaining the appropriate level of security.
For more information on how to share sensitive research data, please refer to our quick guide.