Wellcome Open Research

Open data FAQs

Open data can be confusing, but it doesn’t have to be! In this blog post, we’ll cover some of the questions our team is most frequently asked about open data, to help you get your data ready for submission to Wellcome Open Research.

What is a data availability statement?

A data availability statement is a short section of text that tells the reader how, where, and under what conditions the data associated with your research can be accessed and reused.

Where can I deposit my data?

There are many options to choose from when sharing your data openly in a repository, including institutional repositories, discipline-specific repositories, general data repositories, and controlled access repositories.

How can I ensure my data won’t be scooped?

A key benefit of open data is that it makes a dataset trackable through persistent identifiers, timestamps, and links to the authors. This clarifies which researcher established the idea and created the results first. When a researcher deposits their data in a public repository, they have established that they are the creator of the dataset and should receive credit for it.

What is a DOI or persistent identifier?

DOI stands for digital object identifier, a type of persistent identifier. A DOI is a string of numbers, letters, and symbols that identifies a unique research output and enables citation.

Persistent identifiers are important as they remain constant, even if the location of the digital research output moves. While a URL may change, a persistent identifier will carry across to the new location.
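To make the structure of a DOI more concrete, here is a minimal Python sketch. It assumes the common DOI shape of a prefix beginning with "10." followed by a slash and a suffix, and it builds a link through the doi.org resolver. The function names and the DOI "10.1234/example.5678" are made up for illustration, and the pattern is a rough check rather than a full validator.

```python
import re
from typing import Tuple

# Rough pattern for the common DOI shape: "10.", a registrant code, "/", a suffix.
# This is an illustrative check, not a complete validator.
DOI_PATTERN = re.compile(r"^(10\.\d{4,9})/(\S+)$")


def split_doi(doi: str) -> Tuple[str, str]:
    """Strip a leading 'doi:' or resolver URL and return (prefix, suffix)."""
    cleaned = doi.strip()
    for lead in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(lead):
            cleaned = cleaned[len(lead):]
            break
    match = DOI_PATTERN.match(cleaned)
    if match is None:
        raise ValueError(f"Not a recognisable DOI: {doi!r}")
    return match.group(1), match.group(2)


def resolver_url(doi: str) -> str:
    """Build a link that goes through the DOI resolver rather than a fixed web address."""
    prefix, suffix = split_doi(doi)
    return f"https://doi.org/{prefix}/{suffix}"


# Hypothetical DOI used purely as an example.
print(resolver_url("doi:10.1234/example.5678"))
```

Citing the resolver link rather than the repository's own web address means the citation keeps working if the dataset's landing page later moves.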

Will the data I submit be peer reviewed?

If your dataset is associated with a publication, your publisher may ask peer reviewers to review the data as part of their peer review process. For example, we ask peer reviewers for Wellcome Open Research to review the dataset as part of their assessment of the research.

What is metadata, and what information should it include?

Metadata is data about other data. It aids both discoverability and understanding of the data. Metadata can contain descriptive information about the dataset and administrative or structural information.

The content and format of metadata are often guided by a specific discipline and/or repository. Still, metadata records in a repository typically include information such as creation date, file format, data creator, keywords, location, how the data was generated, and version information, amongst other things.
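As a rough illustration, the typical fields listed above could be captured in a simple record like the Python sketch below. The field names and all of the values are placeholders invented for this example; your repository or discipline may require a different schema.

```python
import json

# Field names are illustrative; real repositories define their own schemas.
EXPECTED_FIELDS = [
    "creation_date",
    "file_format",
    "creator",
    "keywords",
    "location",
    "method",
    "version",
]


def missing_fields(record):
    """Return the expected fields that are absent or empty in the record."""
    return [field for field in EXPECTED_FIELDS if not record.get(field)]


# Placeholder values for a hypothetical dataset.
record = {
    "creation_date": "2024-01-15",
    "file_format": "CSV",
    "creator": "A. Researcher",
    "keywords": ["open data", "example"],
    "location": "Example City",
    "method": "Online survey exported from a questionnaire tool",
    "version": "1.0",
}

print(json.dumps(record, indent=2))
print("Missing fields:", missing_fields(record))
```

Running a quick completeness check like this before deposit can help you spot gaps, although the repository's own submission form remains the authority on what is required.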

Are there instances in which I shouldn’t share my data?

In some situations, sharing your data would not be legal or ethical. This includes when you don’t have ownership of the data, when sharing the data would conflict with the need to protect personal identities, or when data is commercially sensitive or could cause a security risk. Your institution’s research ethics committee can guide you if you’re unsure about sharing your specific dataset.

How can I protect the privacy of research participants?

If your research involves sensitive human data, you need to take extra steps to maintain the privacy of your research participants and share your data in an ethically and legally compliant way. First, anonymise the data. Anonymisation techniques include removing any identifying information, using pseudonyms, and generalising where possible.
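The three techniques mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names (participant_id, name, email, age) are hypothetical, and real datasets usually need a more careful review, because combinations of seemingly harmless fields can still identify a person.

```python
def age_band(age, width=10):
    """Generalise an exact age into a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def anonymise(rows, direct_identifiers=("name", "email")):
    """Drop direct identifiers, replace participant IDs with pseudonyms,
    and generalise age. Column names here are assumptions for the example."""
    pseudonyms = {}
    cleaned = []
    for row in rows:
        original_id = row["participant_id"]
        if original_id not in pseudonyms:
            # Assign pseudonyms in order of first appearance: P001, P002, ...
            pseudonyms[original_id] = f"P{len(pseudonyms) + 1:03d}"
        kept = {
            key: value
            for key, value in row.items()
            if key not in direct_identifiers and key not in ("participant_id", "age")
        }
        kept["pseudonym"] = pseudonyms[original_id]
        kept["age_band"] = age_band(row["age"])
        cleaned.append(kept)
    return cleaned, pseudonyms


# Hypothetical rows for illustration only.
rows = [
    {"participant_id": "A17", "name": "Jane Doe", "email": "jane@example.org", "age": 34, "score": 7},
    {"participant_id": "B02", "name": "John Roe", "email": "john@example.org", "age": 58, "score": 4},
]

cleaned, key_table = anonymise(rows)
print(cleaned)
```

Note that the key table linking pseudonyms back to the original IDs is itself identifying, so it should not be shared alongside the data.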

In some cases, you may want to limit access to the data to specific parties who will treat the data carefully. Controlled access allows you to upload your data to a repository and keep the files private. You can share access with others if specific requirements are met.

We hope this post has answered your questions about open data. If you have any further questions, please get in touch with the Wellcome Open Research Publishing team, who will be happy to answer them, or check out F1000’s on-demand webinar on open data.

