Supporting global malaria control with new genomic datasets
| 19 April, 2023 | Dr Richard Pearson |
In this blog post, Wellcome Open Research speaks to Dr Richard Pearson (Wellcome Sanger Institute and MalariaGEN) about a recent Research Article published on the platform, which introduces an updated dataset of over 20,000 Plasmodium falciparum malaria parasite genomes.
First, let’s meet the author
Dr Richard Pearson is the corresponding author of the Research Article, which was a global collaboration of malaria researchers. Richard is the Parasite Data lead in the Genomic Surveillance Unit at the Wellcome Sanger Institute, where he focuses on producing and analysing whole genome sequencing data from malaria parasite field samples. He has worked on MalariaGEN data releases for over 10 years.
Can you provide a brief introduction to MalariaGEN, the genomic epidemiology network?
MalariaGEN is a data-sharing network of partners in 40+ countries who build and share large-scale malaria parasite and mosquito data resources. We study malaria epidemiology and evolution to create accessible resources for malaria control. These datasets facilitate the search for new drugs, insecticides, and vaccines. They also contribute to genomic surveillance tools that help control and eliminate malaria.
This dataset, Pf7, is the latest iteration of our Plasmodium falciparum malaria parasite data. It was built with contributions from more than 150 authors from 33 different countries, and its data span 1984 to 2018.
Why was the dataset created and what makes it unique?
Malaria is a complicated disease caused by a microscopic parasite transmitted between humans by mosquitoes. It kills hundreds of thousands of people every year, mostly young children and pregnant women in sub-Saharan Africa.
While there are several species of human malaria parasite, by far the most prevalent and deadly is Plasmodium falciparum. In 2021, it caused approximately 98% of the 247 million cases worldwide.
Wherever it appears, people use a variety of drugs, insecticides, and other strategies to stop its spread. However, these control measures exert different evolutionary pressures on the parasite, and its responses to them show up in the genome.
By keeping track of large-scale genetic variation, we can spot when the parasite begins to evade our drugs, vaccines, or other measures.
The Pf7 dataset contains more than 20,000 whole genomes of P. falciparum samples, making it the largest malaria dataset of its kind. It’s also freely available to download at malariagen.net and can even be analysed in the cloud using a Python package we produced.
What key findings came from the dataset?
I see this paper as a presentation of a large dataset. Pf7 is a springboard with preliminary analysis rather than a paper full of in-depth findings.
However, there are three findings that policymakers may wish to look at more closely:
- Drug resistance marker maps are included, and show surprising heterogeneity, even between nearby countries (e.g. parasites in Ghana are nearly all chloroquine-sensitive, while those in nearby Benin are nearly all resistant).
- Preliminary analysis shows that part of the protein used in the RTS,S and R21 vaccines is based on a gene variant that isn’t as closely matched to samples seen in the field as other possible variants.
- The data show exactly where on the chromosomes the hrp2 and hrp3 deletions occur; these are the deletions that cause malaria rapid diagnostic tests to fail. The deletions tend to involve the whole ends of chromosomes.
What impact do you hope this dataset will have on the epidemiology field and more widely?
This is yet another demonstration that genomic surveillance can be an effective tool for local disease control. By looking at a parasite’s genome, we can learn a lot about which drugs it might be resistant to. What’s more, by aggregating the data from many parasites, we can start to tease out trends in time and space.
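As a rough illustration of what aggregating across parasites can look like, here is a minimal Python sketch that groups per-sample resistance calls by country and year and reports the fraction of resistant samples in each group. The records, the country labels, and the function name `resistance_frequency` are all hypothetical and invented for this example; they are not Pf7 data and not part of any MalariaGEN tool.

```python
from collections import defaultdict

# Hypothetical per-sample records: (country, collection year, carries resistance marker?)
# These values are invented for illustration only; they are not Pf7 data.
SAMPLES = [
    ("Country A", 2015, False),
    ("Country A", 2015, False),
    ("Country A", 2016, True),
    ("Country A", 2016, False),
    ("Country B", 2015, True),
    ("Country B", 2015, True),
    ("Country B", 2016, True),
    ("Country B", 2016, False),
]


def resistance_frequency(samples):
    """Return {(country, year): fraction of samples carrying the marker}."""
    counts = defaultdict(lambda: [0, 0])  # [resistant count, total count]
    for country, year, resistant in samples:
        counts[(country, year)][0] += int(resistant)
        counts[(country, year)][1] += 1
    return {key: r / n for key, (r, n) in counts.items()}


if __name__ == "__main__":
    # Print one line per country and year, sorted for readability.
    for (country, year), freq in sorted(resistance_frequency(SAMPLES).items()):
        print(f"{country} {year}: {freq:.0%} resistant")
```

With real data, the same grouping would be applied to thousands of samples, and differences between neighbouring groups or between years are the kind of trend that surveillance looks for.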
If public health decision-makers can begin to implement the insights from this sort of data, they can make more informed choices about malaria prevention strategies. Genomic data can indicate whether a drug or vaccine is likely to work much sooner than more traditional randomised controlled trials can.
Did you face any challenges during your study?
One of the biggest challenges for this study was integrating sequence information from dried blood spots. These samples underwent selective whole genome amplification (sWGA), and we needed to make sure that this extra step wasn’t introducing systematic errors.
Because the parasite DNA on blood spots had to undergo more processing and amplification, there was a worry that the blood spot data wouldn’t be as high-quality as data from the more traditionally extracted venous blood DNA. With some careful analysis, this fear proved unfounded.
Why did you choose to publish your work with Wellcome Open Research?
Wellcome Open Research is our go-to publication for MalariaGEN data releases.
The MalariaGEN community is founded on open-access principles, transparency, and collaboration, and this is shared by Wellcome Open Research and its open publishing model, so it felt like a natural fit.
What’s next for this area of study?
We keep collecting, sequencing, and analysing malaria parasite genomes! The intention is to improve turnaround times so that we can publicly release data while they are relevant for public health decisions.
Read the full Research Article today on Wellcome Open Research to dive deeper into the study and findings, and discover related research in the dedicated Wellcome Sanger Institute Gateway.