Wellcome Open Research

Sharing sensitive data openly and safely 

Open data can help to improve reproducibility, transparency, and trust in research, playing a key role in open science. However, despite these benefits, researchers working with sensitive or commercial information can face challenges when sharing their data. In this blog, we explore how researchers can share sensitive data openly and safely.

What is sensitive data?

Sensitive data can take many forms in different types of research.  

One of the main types of sensitive data is human data, which can include:  

  • Images, videos, audio files, or qualitative data related to attitudes, opinions, or experiences  
  • Clinical trial results  
  • Datasets from social media sites  
  • Personal identifying information, such as age, ethnicity, location, and sexuality  
  • Sensitive health status information, such as alcohol dependency  

Human data is most common in health and social science research but can also appear in other fields.

These types of data need to be handled particularly carefully and only shared openly when sufficient safeguards are in place, as they can allow individuals to be identified, especially when used in combination.
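To see why combinations matter, here is a minimal sketch in Python. The records, field names, and the `unique_rows` helper are all hypothetical and made up for illustration; the idea is simply to count how many records are the only one with a given combination of values.

```python
from collections import Counter

# Hypothetical records: no names, only indirect identifiers.
records = [
    {"age": 34, "ethnicity": "A", "location": "Leeds"},
    {"age": 34, "ethnicity": "B", "location": "Leeds"},
    {"age": 51, "ethnicity": "A", "location": "Leeds"},
    {"age": 34, "ethnicity": "A", "location": "York"},
    {"age": 51, "ethnicity": "A", "location": "Leeds"},
]


def unique_rows(rows, fields):
    """Count rows whose combination of values in `fields` appears only once."""
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1)


# Location alone singles out one record...
single = unique_rows(records, ["location"])
# ...but age, ethnicity, and location together single out three.
combined = unique_rows(records, ["age", "ethnicity", "location"])
print(single, combined)
```

In this toy dataset, a record that stands alone for its combination of values is one that could point to a single person, even though no name is stored. A real assessment would need far more care than this count.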

Other types of sensitive data include:  

  • Intellectual property, such as new inventions  
  • National security data, such as classified information from governmental bodies  
  • Third-party data, such as proprietary commercial information  

This type of sensitive data can be found across all types of research.   

How can I make sure I share my sensitive data safely?  

There are several measures researchers can take to ensure they can share data safely and securely.  

Identify ethical and legal requirements for data  

Before being able to share any data, including sensitive data, you must understand all applicable legal and ethical requirements for your research.  

These differ depending on where you as an author are based, where your participants might be based, where your research is conducted, or where any third parties are based.  

For example, in Europe, the General Data Protection Regulation (GDPR) governs data protection, whilst the US has multiple federal and state requirements, including the federal Privacy Act and California's CCPA. According to the UN, 137 countries around the world have some form of data protection legislation, so it’s important to identify the legislation relevant to your work.

In addition, ethical considerations must underpin any sharing of sensitive data, and you must ensure you consider the rights and dignity of individuals.  

Your institution, funder, or organization can usually provide guidance on ethical and legal considerations for both human and non-human sensitive data. For example, Wellcome’s funding guidance provides information on different ethical and IP issues.  

Create an Output Management Plan  

An Output Management Plan (OMP), or Data Management Plan, is a key tool for ensuring you share sensitive data safely.

An OMP allows you to identify from the outset, before you even begin your research, the types of data you may collect, create, or reuse throughout the project.   

In turn, this can help you to identify what type of data might be sensitive, any measures you might need to put in place to ensure you can share data at the end of the project, and what data must not be shared at any point. 

Creating an Output Management Plan and using the information within it will help to inform how you carry out your research, and how you deal with data at each step of a project.  

It’s important to think about both human and non-human data in an OMP, including any commercial or third-party data that might be reused or developed.  

Wellcome provides lots of guidance for completing an OMP, as well as examples of successful OMPs.  

Gain consents

Whether you’re working with human participants or other third parties, it’s important to gain the appropriate consent.  

You need to clearly communicate what data you propose to share, how the data will be shared, the level of open access to be granted, and any limitations on this.  

All participants or third-party organizations must have the option to opt out of having their data or IP shared and, in the case of human data, the option to have their data anonymised before it is shared.

These consents need to be recorded clearly in the Output Management Plan, ideally with evidence in writing, for the avoidance of doubt. It’s also important to note that once these consents are given, you can only share data in the exact way that has been consented to.
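As an illustration only, a consent record and a check against it could be sketched in Python as below. The `Consent` class, its field names, the access level labels, and the `may_share` function are assumptions made for this example; they are not a Wellcome template or a required format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consent:
    """Hypothetical record of what one participant or third party agreed to."""
    participant_id: str
    data_types: frozenset   # what may be shared, e.g. {"survey"}
    access_level: str       # "open", "controlled", or "none" (opted out)
    anonymised_only: bool   # True if data may only be shared once anonymised


def may_share(consent, data_type, access_level, is_anonymised):
    """Return True only if the proposed sharing matches the recorded consent exactly."""
    if consent.access_level == "none":
        return False  # opted out of sharing altogether
    if data_type not in consent.data_types:
        return False  # this kind of data was never covered
    if access_level != consent.access_level:
        return False  # a different level of access than was agreed
    if consent.anonymised_only and not is_anonymised:
        return False  # anonymisation was a condition of consent
    return True


consent = Consent("P001", frozenset({"survey"}), "controlled", True)
print(may_share(consent, "survey", "controlled", True))  # matches the consent
print(may_share(consent, "survey", "open", True))        # more open than agreed
```

The check is deliberately strict: anything that differs from what was recorded is refused, which mirrors the point that data can only be shared in the way that was consented to.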

Read Wellcome’s policy on human participants and the policy on Intellectual Property. 

Anonymise human data

When working with human data, one of the safest ways to share data openly while maintaining confidentiality is anonymisation.  

This removes any identifying information from a dataset to reduce the likelihood of re-identification. However, this is not a replacement for consent and should only be done with data for which you have already received informed consent to share.  

Anonymisation can be done for direct identifiers, such as full name, date of birth and address, and indirect identifiers such as ethnicity, gender, sexuality, or place of birth.  

Key data anonymisation techniques include:  

  • Removing any variables that are not necessary for analysis or relevant to the research  
  • Making an information point less specific, such as swapping an address for a city  
  • Referring to a research participant without using their real information by using aliases  
  • Taking specific information like age and putting it into a banded range

Examples of anonymisation of quantitative and qualitative data.
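The four techniques listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the field names, the sample participant, the ten-year age bands, and the "street, city" address format are all hypothetical, and real datasets usually need a more careful review.

```python
def band_age(age, width=10):
    """Replace an exact age with a banded range, e.g. 34 becomes '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def anonymise(rows):
    """Apply the four techniques to each row (assumed field names)."""
    aliases = {}  # real name -> alias; this mapping is itself identifying
    result = []
    for row in rows:
        # Alias: refer to the participant without using their real name.
        alias = aliases.setdefault(row["name"], f"Participant {len(aliases) + 1}")
        result.append({
            "participant": alias,
            # Less specific: keep only the city (assumes 'street, city' format).
            "city": row["address"].rsplit(",", 1)[-1].strip(),
            # Banding: swap the exact age for a range.
            "age_band": band_age(row["age"]),
            "response": row["response"],
            # Removal: 'email' is not needed for analysis, so it is not copied.
        })
    return result


rows = [
    {"name": "Jane Doe", "email": "jane@example.org",
     "address": "12 High Street, Leeds", "age": 34, "response": "agree"},
]
anonymised = anonymise(rows)
print(anonymised)
```

Note that the alias mapping links aliases back to real names, so it would need to be stored securely or discarded. Free-text fields such as `response` can also contain identifying details that a sketch like this will not catch.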

Control access to data

In some cases, human data cannot be anonymised without losing its value, or you may be working with different sensitive data such as intellectual property or commercial data.  

In these cases, an alternative data sharing method is to use a controlled access data repository (again, only when consent is already granted).  

These repositories allow researchers to store their data without publishing it publicly. Instead, a metadata record, such as a Data Availability Statement on Wellcome Open Research, is shared openly and describes the data’s location and the conditions under which it can be accessed.

The repository will then require users to meet certain requirements to access the data, thus ensuring that data is shared only in a way that is fully controlled by researchers.  
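The split between an open metadata record and gated data can be pictured with a short Python sketch. Everything here is hypothetical: the `DatasetRecord` class, the repository name, and the access conditions are invented for illustration and do not describe how any particular repository works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical entry for a dataset held in a controlled access repository."""
    title: str
    location: str                 # where the data is held
    access_conditions: frozenset  # requirements a requester must meet

    def public_metadata(self):
        """The part that is shared openly: a description, never the data."""
        return {
            "title": self.title,
            "location": self.location,
            "access_conditions": sorted(self.access_conditions),
        }


def grant_access(record, requester_credentials):
    """Grant access only if every condition on the record is met."""
    return record.access_conditions <= set(requester_credentials)


record = DatasetRecord(
    title="Interview transcripts (hypothetical)",
    location="Example Controlled Repository",
    access_conditions=frozenset({"signed_data_access_agreement", "ethics_approval"}),
)
print(record.public_metadata())
print(grant_access(record, {"ethics_approval"}))  # one condition is missing
```

Anyone can read the metadata and see what they would need to provide, but the data itself is only released once all of the stated conditions are satisfied.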

Publish data-related information

Regardless of the measures taken by the authors, there are some cases where data cannot be shared openly. 

As a result, most publishers and funders will have exceptions to open data policies, including Wellcome Open Research.  

In these cases, authors can publish some of their data-related information instead, such as:  

  • Methods sections that provide a detailed description of how the study and subsequent data were created.  
  • Metadata, such as Data Availability Statements, providing a description of the final data, discussion of any variables assessed, and a data sharing disclaimer.  
  • Any intermediary data that can be shared without concern.  
  • Detailed information about where third-party data was sourced and how users can source it themselves.  

This helps other researchers to reproduce the research for themselves, even if the original data cannot be shared.

Next steps

If you’d like to find out more about sharing data, visit our data sharing guidelines. And if you’re ready to join the 9,000+ Wellcome-funded researchers already publishing their work with Wellcome Open Research, submit your research for publication today.

