Assessing the Chinese CDC Preprint on the Coronavirus Pandemic
This blog provides an assessment of the Chinese CDC preprint on the coronavirus pandemic. Learn what the available analyses by two groups with access tell us, and find out what the author's summary of the preprint is.
Bloom Lab
Lab studying molecular evolution of proteins and viruses. Affiliated with @fredhutch @HHMINEWS @uwgenome. @jbloom_lab@mstdn.science
-
I’ve refrained from commenting on metagenomics of environmental samples from Huanan Seafood Market, since media coverage preceded description of data or analysis.
— Bloom Lab (@jbloom_lab) March 22, 2023
Data still not available, but there is now enough written down to offer partially informed assessment. -
TLDR is I agree w WHO’s @DrTedros (https://t.co/fXF30JSauG):
— Bloom Lab (@jbloom_lab) March 22, 2023
These data don’t tell us how pandemic began, but every bit of information helps.
Here is what I glean from the available analyses by the two groups w access. -
First analysis is pre-print Chinese CDC posted over year ago (Feb-25-2022) that describes sampling animals & environment at Huanan Market: https://t.co/VFz46D6mt4
— Bloom Lab (@jbloom_lab) March 22, 2023
It is moving thru peer review at glacial pace: @researchsquare status says still under review at a Nature journal. -
Below is my summary of Chinese CDC preprint when it posted last year: https://t.co/E5I0GhjZC2
— Bloom Lab (@jbloom_lab) March 22, 2023
Basically, it reports no animal samples were SARS2 positive, but some environmental samples were.
It concludes positive environmental samples consistent w shedding by infected humans. -
At the time it posted, I noted two caveats associated with Chinese CDC pre-print.
— Bloom Lab (@jbloom_lab) March 22, 2023
First caveat was that no raw data (FASTQ files) were available from sequencing of the samples: https://t.co/2oIfbvNQrn -
Second caveat is that all samples were collected on or after Jan-1-2020: https://t.co/URYqnnGnco
— Bloom Lab (@jbloom_lab) March 22, 2023
Human infections started in Wuhan no later than Nov 2019, which limits how much can be concluded from either positive or negative samples collected on Jan 1. -
Chinese CDC collected samples throughout market, but focused on southwest corner where animals sold. Because sampling focused on southwest corner, both negative (green) and positive (orange) environmental samples were mostly from that part of market. pic.twitter.com/edc9muqOci
— Bloom Lab (@jbloom_lab) March 22, 2023 -
Chinese CDC reports testing samples for SARS2 by PCR & deep sequencing them. They report no tendency for the rate of PCR positive samples to be associated with any type of market vendor: similar positivity rates for stalls selling wildlife, vegetables, livestock, seafood, etc. pic.twitter.com/T49j9tbopy
— Bloom Lab (@jbloom_lab) March 22, 2023 -
Table 1 summarizes environmental sample testing.
— Bloom Lab (@jbloom_lab) March 22, 2023
I’ve added highlighting & raccoon dog image next to sample Q61.
This sample was collected on Jan-12 & was negative for SARS2 by PCR but positive by NGS, which presumably means it had detectable SARS2 reads in deep sequencing. pic.twitter.com/c3SE4SA02Z -
Does fact Q61 was negative for SARS2 by PCR testing mean it had less viral RNA than samples w measurable Ct values? Unclear, because neither Chinese CDC pre-print nor subsequent analysis discussed below quantify fraction of deep sequencing reads that map to SARS2 for all samples.
— Bloom Lab (@jbloom_lab) March 22, 2023 -
This brings us to the second analysis by Crits-Christoph et al which recently posted on Zenodo. This pre-print analyzes deep sequencing data for some samples the authors obtained via GISAID, although those data remain unavailable to others.https://t.co/gO3Xwqhe2l
— Bloom Lab (@jbloom_lab) March 22, 2023 -
The way Crits-Christoph et al obtained these data has itself become a controversy. I am not going to address that. Read the first 3 pages of Crits-Christoph report linked above & GISAID statement (https://t.co/Tpii459lfc) for opposing viewpoints.
— Bloom Lab (@jbloom_lab) March 22, 2023 -
Instead, I’ll focus on the substance of the Crits-Christoph analysis. Their analysis focused on the non-SARS2 metagenomic content of the deep sequenced environmental samples.
— Bloom Lab (@jbloom_lab) March 22, 2023 -
Key results are in Fig 1 of their analysis, shown below.
— Bloom Lab (@jbloom_lab) March 22, 2023
Contour heatmap shows fraction of positive samples from different regions of market.
Positive samples mostly from southwest corner, since that is where Chinese CDC mostly collected samples. pic.twitter.com/itzwm4saKa -
So contour plot just reflects where Chinese CDC collected samples (as shown in Fig 1A of their pre-print, mentioned earlier in this thread).
— Bloom Lab (@jbloom_lab) March 22, 2023
In other words, contour does not show percent positivity of collected samples, but total positive samples not normalized by total samples. -
The novel part of Fig 1 of Crits-Christoph et al is fraction of mammalian mtDNA reads in each analyzed sample that mapped to variety of mammalian species. These are pie chart circles.
— Bloom Lab (@jbloom_lab) March 22, 2023
There are samples w mtDNA from humans, pigs, sheep, cows, Siberian weasels, raccoon dogs, etc. pic.twitter.com/VoWnaAZUMn -
The sample that has attracted the most attention is the mostly light green pie circle, for which the dominant mtDNA is from a raccoon dog. This has attracted attention because raccoon dogs are one of several species known to be susceptible to SARS2.
— Bloom Lab (@jbloom_lab) March 22, 2023 -
That green pie circle is sample Q61 from Chinese CDC pre-print, which was SARS2 negative by PCR but positive by deep sequencing. When data become available, I would like to quantify SARS2 reads in each sample, as it would be useful to know how much SARS2 was in each sample.
— Bloom Lab (@jbloom_lab) March 22, 2023 -
For instance, it is unclear whether there is enough SARS2 in the raccoon dog mtDNA dominated Q61 sample to actually infer a viral sequence, or if the undetectable Ct value for that sample means there are just trace levels of reads that are too low to obtain a viral sequence.
— Bloom Lab (@jbloom_lab) March 22, 2023 -
There are also lots of pie circles where the dominant mtDNA is from humans, & also from animals that were probably only sold as meat (eg, cow, sheep, pig). And although not shown in figure (since it’s not a mammal), the text reports some samples had fish mtDNA.
— Bloom Lab (@jbloom_lab) March 22, 2023 -
So as Crits-Christoph et al note in text, fact that mtDNA from a species is found in a sample doesn’t mean that species was infected w SARS2. It just means material from that animal ended up in the same place as viral RNA, which was widespread in the Huanan Market by Jan 2020. pic.twitter.com/DPaZVypq7s
— Bloom Lab (@jbloom_lab) March 22, 2023 -
They also analyze a few samples in more detail, including assembling near-complete mtDNA for some animals, and some genomic / cDNA contigs for sample Q61. This could be useful for learning more details about animals from which genetic material in the market was derived.
— Bloom Lab (@jbloom_lab) March 22, 2023 -
Overall, main thing we learn is details of which animals or products (eg, meats) were in market before it closed on Jan-1-2020.
— Bloom Lab (@jbloom_lab) March 22, 2023
This doesn’t tell us if any infected w SARS2. But knowing more about animals could help trace supply chain, which is valuable line of investigation. -
However, human SARS2 infections started in Wuhan no later than Nov 2019. So we have to be circumspect about environmental samples from Jan 2020.
— Bloom Lab (@jbloom_lab) March 22, 2023
Eg, raccoon dog sample Q61 was collected on Jan-12-2020, which is at least 6 and probably >8 weeks after first human infections. -
This brings me back to main caveat I noted about Chinese CDC pre-print when it posted last year: we are unlikely to get conclusive answers about origin of an outbreak that started in Nov 2019 (or earlier) by looking at samples collected in Jan 2020: https://t.co/edzi6dupOc
— Bloom Lab (@jbloom_lab) March 22, 2023 -
Analyses of Jan 2020 samples is definitely worthwhile, because as @DrTedros rightly noted each piece of data is important for better understanding initial outbreak in Wuhan.
— Bloom Lab (@jbloom_lab) March 22, 2023
But to identify actual origin of outbreak, we need details from Nov or early Dec 2019. -
Recall that there are descriptions of confirmed cases, both unlinked and linked to the market, dating back to Nov and very beginning of Dec 2019: https://t.co/Z7YrS9CeAV
— Bloom Lab (@jbloom_lab) March 22, 2023 -
See also this case timeline from an early paper: https://t.co/yUtIuiv08a
— Bloom Lab (@jbloom_lab) March 22, 2023
If we ever learn origin of SARS2, I suspect it will come from information on cases or events that preceded those first reported cases.