coronavirus positivity predicting bat
Post on 29-May-2022
The Dark Knight - Timothy Quek, Ryan Kim, Peter Wang, Isaac Law
Predicting Bat Coronavirus Positivity
Background
● Bats comprise ~20% of mammal species (> 1,400 species)
● Serve as reservoir hosts of many deadly viruses (e.g. Ebola, Hendra, Nipah)
● The SARS-CoV-2 virus that led to the current COVID-19 pandemic likely originated from an Asian bat species
● Scientists do research on bats worldwide to study relationships between bats and viruses
Problem Statement
Predict factors that make it more likely for a particular bat species to be a potential coronavirus reservoir host
- Geographical and Environmental Characteristics
- Morphological and Other Biological Traits
- Phylogenetic Group
Data Analysis
Datasets
01. Bat CoV Positivity: Dataset manually collected from 100+ published papers, recording coronavirus positivity rates among samples from bats
02. PanTHERIA: Global species-level dataset of mammalian life-history, ecological and geographical traits
03. EltonTraits 1.0: Global species-level dataset of foraging attributes of mammals
04. Bat Ecology / Viral Diversity: Bat-specific dataset from a Canadian study on viral diversity and reservoir status
05. Zoonotic Infectious Diseases: Dataset from a study on zoonotic emerging infectious diseases, including geographical / environmental features
Main Features Included: 13 features selected out of 72
Weather
- Weather cluster (Precipitation + Temperature)
- Actual / Potential Evapotranspiration Rate
Location Cluster (approximately corresponds to continent)
- Geographical cluster
Phylogenetic Cluster (56 million years ago)
- Factorized cluster
Land Use and Environment
- Land-Barren
- Evergreen Broadleaf
- Managed Vegetation
- Crop Change
Species Diversity / Livestock
- Mammalian Diversity
- # of Poultry (log)
- # of mammalian livestock (log)
Human Population
- Human Population Density change (for each grid cell)
Weather
- Precipitation and Temperature
● Each dot represents one particular bat species
● High prevalence: ≥5% coronavirus positivity (red dots); Low prevalence: <5% positivity (blue dots)
Figure legend: Low Prevalence / High Prevalence
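The high / low split above can be sketched in a few lines of Python. This is only an illustration of the ≥5% cut-off stated on the slide; the function names and the sample counts are hypothetical and do not come from the project's code or data.

```python
# Threshold taken from the slide: >= 5% positivity counts as high prevalence.
HIGH_PREVALENCE_THRESHOLD = 0.05


def positivity_rate(n_positive: int, n_tested: int) -> float:
    """Fraction of sampled bats that tested positive for coronavirus."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    return n_positive / n_tested


def prevalence_label(n_positive: int, n_tested: int) -> str:
    """Return 'high' (red dot) or 'low' (blue dot) for one bat species."""
    rate = positivity_rate(n_positive, n_tested)
    return "high" if rate >= HIGH_PREVALENCE_THRESHOLD else "low"


# Hypothetical sample counts, for illustration only.
print(prevalence_label(5, 100))  # high
print(prevalence_label(4, 100))  # low
```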
Effect of Temperature and Humidity on Coronavirus Infectivity
Adv Virol 2011;2011:734690. doi: 10.1155/2011/734690. Epub 2011 Oct 1.
Land Use and Environment
Land Use and Environment
- Proportion of Barren land
- Broadleaf Evergreen Forest
- Managed Vegetation
- Cropland Change
● Higher proportion of barren land in the geographical distribution of bat species associated with higher coronavirus prevalence
Phylogenetic Cluster (56 million years ago)
● PhyloClust56: phylogenetic clusters based on evolutionary relationships between bats 56 million years ago
● “PC3” showed lower coronavirus positivity than the other phylogenetic clusters in univariate analysis
Ecology of Mammals
Ecology of Mammals
- Mammalian Diversity
- # of Poultry (log)
- # of mammalian livestock (log)
● (1) Mammalian diversity and (2) poultry / mammalian livestock headcounts show statistically significant relationships with bat coronavirus prevalence
Mammalian Diversity and Emerging Infectious Diseases
● High mammalian biodiversity is associated with lower prevalence of bat coronavirus positivity
● Previous research: biodiversity loss increases disease transmission
● Mechanism is unclear; one speculation:
○ Species that are better at buffering disease transmission are more affected by reductions in biodiversity
○ Conversely, species with higher rates of reproduction (which spend fewer resources on host immunity) may survive longer during reductions in biodiversity
Nature. 2010 Dec 2;468(7324):647-52.
Geographical Location
Location Cluster
- Bat species found in the location cluster corresponding to Africa have a higher coronavirus positivity (after correction for other factors)
※ Red dots: ≥5% positivity
※ Blue dots: <5% positivity
Modeling and Results
Two Main Analyses
● Prevalence Rate Modeling: Poisson Regression
● High vs Low Coronavirus Prevalence Classification: Generalized Boosted Model
Feature Correlation & Feature Engineering
Poisson Regression - Count Response Modeling
● Modelled outcome: number of positive bats (out of 100 bats)
● Stepwise forward inclusion based on AIC
● RMSE ~ 5.5
● Reasonable fit, except for under-fitting at both extremes
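As a rough illustration of the Poisson approach, the sketch below fits a count model with a single predictor by Newton-Raphson and then performs one forward-selection step: the predictor is kept only if it lowers the AIC relative to the intercept-only model. The slides do not say which software or exact procedure the team used, so everything here is an assumption: the data are invented, the predictor stands in for something like standardized mammalian diversity, and the function names are hypothetical.

```python
import math


def fit_poisson(x, y, n_iter=25):
    """Fit log(mu) = b0 + b1 * x by Newton-Raphson; returns (b0, b1)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start at the intercept-only fit
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information (2x2 matrix), inverted in closed form.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def poisson_loglik(y, mu):
    """Poisson log-likelihood; lgamma(y + 1) equals log(y!)."""
    return sum(yi * math.log(mi) - mi - math.lgamma(yi + 1)
               for yi, mi in zip(y, mu))


def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik


def rmse(y, mu):
    return math.sqrt(sum((yi - mi) ** 2 for yi, mi in zip(y, mu)) / len(y))


# Invented data: x is a standardized predictor, y is positives per 100 bats.
x = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
y = [14, 10, 8, 5, 4, 2, 1]

# Candidate 0: intercept only (the MLE of mu is the sample mean).
mu_null = [sum(y) / len(y)] * len(y)
aic_null = aic(poisson_loglik(y, mu_null), n_params=1)

# Candidate 1: intercept plus the predictor.
b0, b1 = fit_poisson(x, y)
mu_full = [math.exp(b0 + b1 * xi) for xi in x]
aic_full = aic(poisson_loglik(y, mu_full), n_params=2)

# Forward step: keep the predictor only if it lowers the AIC.
include_predictor = aic_full < aic_null
print(include_predictor, round(rmse(y, mu_full), 2))
```

A full forward-stepwise run would repeat this comparison over every remaining candidate feature and stop when no addition lowers the AIC.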
Generalized Boosted Model - Binary Classification
● Model accuracy: ~ 74%
● GBM does not provide p-values or coefficients, but ranks variables by relative influence
● Mammal & poultry ecological variables have heavy influence on bat coronavirus positivity (mammal biodiversity, mammal livestock / poultry headcount)
● Consistent with previous studies
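To show what "ranks variables by relative influence" means in practice, here is a deliberately small gradient-boosting sketch using depth-1 trees (stumps) and the log-loss gradient. It is not the team's model: the slides do not name the software, the two features and six rows are invented, and the leaf values use a plain gradient step rather than the Newton step found in production libraries. Relative influence is computed as each feature's share of the total squared-error reduction across all splits.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def best_stump(X, resid):
    """Find the single split that best fits the residuals (least squares)."""
    mean_r = sum(resid) / len(resid)
    total_sse = sum((r - mean_r) ** 2 for r in resid)
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, resid) if row[j] <= thr]
            right = [r for row, r in zip(X, resid) if row[j] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    sse, j, thr, lm, rm = best
    return j, thr, lm, rm, total_sse - sse  # last item: error reduction


def fit_gbm(X, y, n_trees=50, learning_rate=0.1):
    """Boost stumps on the negative gradient of the log-loss."""
    p0 = sum(y) / len(y)
    f0 = math.log(p0 / (1 - p0))  # initial log-odds
    scores = [f0] * len(X)
    stumps, influence = [], [0.0] * len(X[0])
    for _ in range(n_trees):
        resid = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, thr, lm, rm, gain = best_stump(X, resid)
        stumps.append((j, thr, lm, rm))
        influence[j] += gain
        scores = [s + learning_rate * (lm if row[j] <= thr else rm)
                  for s, row in zip(scores, X)]
    total = sum(influence) or 1.0
    rel_influence = [100 * v / total for v in influence]
    return f0, learning_rate, stumps, rel_influence


def predict_high(model, row):
    """True when the boosted score maps to a probability of at least 0.5."""
    f0, learning_rate, stumps, _ = model
    score = f0 + sum(learning_rate * (lm if row[j] <= thr else rm)
                     for j, thr, lm, rm in stumps)
    return sigmoid(score) >= 0.5


# Invented rows: (mammal diversity index, unrelated noise feature).
X = [(0.2, 1.0), (0.3, 3.0), (0.25, 2.0), (0.8, 1.5), (0.9, 2.5), (0.85, 0.5)]
y = [1, 1, 1, 0, 0, 0]  # 1 = high prevalence, 0 = low prevalence

model = fit_gbm(X, y)
rel_influence = model[3]
print(rel_influence)
print(predict_high(model, (0.22, 2.0)), predict_high(model, (0.88, 2.0)))
```

In this toy data only the first feature separates the classes, so it receives essentially all of the relative influence, which mirrors how the ranking on the slide should be read: a share of explained variation, not a coefficient or a p-value.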
Model Inference: Regression and Classification
● Mammalian biodiversity plays an important role in both models
● Bats in geographical ranges with HIGHER mammal biodiversity => lower CoV prevalence
● Weather, land use and ecological factors come after mammalian diversity
Predictions
● Prediction Process: construct 95% C.I. with Poisson Regression, cross-check with GBM model
● Bat species flagged as “high CoV risk” when both models converge
● Attempted to predict the coronavirus risk in Rhinolophus bats, thought to be a major reservoir of SARS-related coronaviruses
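The cross-check between the two models can be sketched as a simple agreement rule. The slides do not spell out how the 95% C.I. is compared with the classifier output, so the rule below is an assumption: a species is flagged only when the whole interval sits at or above 5 positives per 100 bats and the classifier also predicts high prevalence. The class name, field names, species labels and numbers are all hypothetical.

```python
from dataclasses import dataclass

HIGH_PREVALENCE_PER_100 = 5.0  # the >= 5% cut-off used elsewhere in the slides


@dataclass
class SpeciesPrediction:
    species: str
    ci_low: float    # lower 95% bound, positives per 100 bats (Poisson model)
    ci_high: float   # upper 95% bound
    gbm_high: bool   # True when the classifier predicts high prevalence


def is_high_risk(pred: SpeciesPrediction) -> bool:
    """Assumed rule: whole C.I. at or above the cut-off AND classifier agrees."""
    return pred.ci_low >= HIGH_PREVALENCE_PER_100 and pred.gbm_high


# Placeholder species and values, for illustration only.
predictions = [
    SpeciesPrediction("species_a", 6.1, 12.4, True),
    SpeciesPrediction("species_b", 2.0, 7.5, True),
    SpeciesPrediction("species_c", 5.5, 9.0, False),
]
flagged = [p.species for p in predictions if is_high_risk(p)]
print(flagged)  # ['species_a']
```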
Model Predictions
Rhinolophus inops
Rhinolophus subrufus
Findings
● Factors increasing the risk of high bat coronavirus prevalence include reduced mammalian diversity and low temperature / humidity
● Weather, land use and ecological factors have higher explanatory power than bat characteristics
● Our models predict that 5 species of Rhinolophus bats from the Philippines likely have a high coronavirus prevalence
● Deforestation and destruction of animal habitats likely contribute to the higher incidence of emerging infectious diseases
● The importance of mammalian diversity loss as a predictor in our models likely reflects this
● “...global changes in the mode and the intensity of land use are creating expanding hazardous interfaces between people, livestock and wildlife reservoirs of zoonotic disease.”
Footnote
Nature. 2020 Aug; 584(7821): 398-402.
ACKNOWLEDGEMENTS
● Professor Maria Cristina Rulli, Politecnico di Milano
● Professor Paolo D’Odorico, University of California, Berkeley
● Dr. Amanda Adams, Bat Conservation International
● Dr. Natasha Spottiswoode, University of California, San Francisco
● Authors of all the papers that we used in this Capstone project
● Dr. Fred Nugen, University of California, Berkeley
● Dr. Alberto Todeschini, University of California, Berkeley
● Our wonderful section mates
● Our families
● The bats
THANKS!
http://bat-cov-positivity.org/home
Main Features to Include
Weather
- Mean Precipitation
- Mean Temperature (squared)
● Using K-means clustering, put bat species into 2 clusters based on temperature and precipitation
● Monthly mean precipitation / temperature of species habitat: low temperature and precipitation associated with higher bat coronavirus prevalence
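The two-cluster weather feature can be illustrated with a plain K-means (Lloyd's algorithm) on standardized temperature and precipitation. This is a sketch under assumptions: the slides do not state whether the inputs were scaled or how the centroids were initialized, and the six (temperature in °C, monthly precipitation in mm) rows are invented. A deterministic start is used so the example is reproducible.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so temperature and precipitation share a scale."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(r, mus, sds)) for r in rows]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_two(points, max_iter=100):
    """Lloyd's algorithm with k = 2 and a deterministic start."""
    # Start from the first point and the point farthest from it.
    c0 = points[0]
    c1 = max(points, key=lambda p: sq_dist(p, c0))
    centroids = [c0, c1]
    labels = None
    for _ in range(max_iter):
        new_labels = [min((0, 1), key=lambda k: sq_dist(p, centroids[k]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(mean(col) for col in zip(*members))
    return labels, centroids


# Invented habitat climate per species: (mean temperature, mean precipitation).
rows = [(8.0, 30.0), (10.0, 35.0), (9.0, 25.0),
        (26.0, 200.0), (27.0, 220.0), (25.0, 210.0)]

labels, centroids = kmeans_two(standardize(rows))
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The resulting 0 / 1 label is what would then enter the models as the "weather cluster" feature.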
Contributions By Team Members
All of the team members contributed actively in the following areas of the project:
- Development and refinement of concept
- Literature Review
- Coordination with Professors
- Data collection
- Data Cleaning and missing Data imputation
- Visualizations
- Machine Learning Algorithms
- Website Design
- Writing of Paper
EDA of Selected Features in Final Merged Dataset
Change in Human Population Density
Human Population
- Rate of Change in Human Population Density between 1990 and 1995
● “HuPopDen_Chg” shows the rate of change of human population density between 1990 and 1995
● Interestingly, a lower change in human population density (between 1990 and 1995) tends to be associated with a higher bat coronavirus prevalence
Land Use and Environment
Land Use and Environment
- Land-Barren
- Evergreen Broadleaf
- Managed Vegetation
- Crop Change
● Change in land use for cropland (grid cell, 1900-2000) and the proportion of area covered by barren land / evergreen forest / cultivated vegetation show statistical significance
Possible Future Studies (and Capstone Projects?)
● Using species distribution and land use data, predict potential intermediate hosts that may result in coronavirus spillover infections from bats to humans
● Choosing a specific bat-related zoonosis with well-mapped index cases, aim to predict areas with a high likelihood of future cases