Predicting food web connectivity: Phylogenetic scope, evidence thresholds, and intelligent agents
Cynthia Sims Parr
Ecological Society of America, Memphis, TN, August 8, 2006
Bacteria
Microprotozoa
Amphithoe longimana
Caprella penantis
Cymadusa compta
Lembos rectangularis
Batea catharinensis
Ostracoda
Melanitta
Tadorna tadorna
ELVIS: Ecosystem Localization, Visualization, and Information System
Oreochromis niloticus (Nile tilapia)
??
Species list constructor
Food web constructor
ELVIS’s Food Web Constructor predicts basic network structure
Prelude to systems models
[Figure: a food web (nodes G and S; labels: node, link) beside an evolutionary tree (taxa G, S, and A; labels: taxon, step)]
Evolutionary Distance Weighting
1. Set distance thresholds
2. Find relatives of target nodes X, Y with known link status
E.g. relative A is close to X, relative B is close to Y, where the Link Value between A and B is known
3. For each found link, compute weight based on distance
\[ \mathrm{Weight}_{AB} = \frac{1}{1 + (\mathrm{Distance}_{XA} \times \mathrm{Penalty}_{XA}) + (\mathrm{Distance}_{YB} \times \mathrm{Penalty}_{YB})} \]
4. Compute certainty index for a predicted link by combining weighted link values, with a discount for negative evidence
\[ \mathrm{CertaintyIdx}_{XY} = \sum_{i=1}^{N} \mathrm{LinkValue}_i \left( \frac{\mathrm{weight}}{\mathrm{discount}} \right)_i \]
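The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weight is taken to be 1 / (1 + Distance_XA × Penalty_XA + Distance_YB × Penalty_YB), the certainty index is taken to be the sum of LinkValue × weight / discount, link values are coded as +1 (link observed) and -1 (link observed absent), and only negative evidence is discounted. The names `Evidence`, `weight`, `certainty_index`, and the default discount of 2.0 are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """One known link (or known non-link) between relatives A and B."""
    link_value: float         # assumed coding: +1.0 link observed, -1.0 link observed absent
    dist_xa: int              # steps between target X and its relative A
    dist_yb: int              # steps between target Y and its relative B
    penalty_xa: float = 1.0   # direction penalty for the X-A path (assumed default)
    penalty_yb: float = 1.0   # direction penalty for the Y-B path (assumed default)


def weight(ev: Evidence) -> float:
    """Weight shrinks as the relatives get farther from the targets."""
    return 1.0 / (1.0 + ev.dist_xa * ev.penalty_xa + ev.dist_yb * ev.penalty_yb)


def certainty_index(evidence: list[Evidence], negative_discount: float = 2.0) -> float:
    """Sum link values, each scaled by weight / discount.

    Only negative evidence is discounted here (an assumption);
    positive evidence uses a discount of 1.
    """
    if negative_discount <= 0:
        raise ValueError("negative_discount must be positive")
    total = 0.0
    for ev in evidence:
        discount = negative_discount if ev.link_value < 0 else 1.0
        total += ev.link_value * weight(ev) / discount
    return total


# Made-up evidence for a candidate link X -> Y.
evidence = [
    Evidence(link_value=1.0, dist_xa=0, dist_yb=0),   # exact match: weight 1
    Evidence(link_value=-1.0, dist_xa=1, dist_yb=1),  # relatives one step away: weight 1/3
]
score = certainty_index(evidence)
print(round(score, 3))  # 0.833
```

A predicted link would then be reported when the certainty index clears some evidence threshold; where that threshold sits is one of the parameters explored later in the talk.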
Food web database
Source                 Webs   Nodes   Links
Animal Diversity Web    n/a     711    2165
EcoWEB                  212    4503   11967
Webs on the Web          19    1373   12056
Interaction Web DB       26    2139    9882
Tuesday Lake              2     101     510
Total                   259    8827   36580
4600 distinct taxa
Food web data: Cohen 1989, Dunne et al. 2006, Vazquez 2006, Jonsson et al. 2005
Evolutionary tree: Parr et al. 2004, plus plants from ITIS and a hierarchy of non-taxonomic nodes
Testing the algorithm
1. Take each web out of the database
2. Attempt to predict its links
3. Compare prediction with actual data
Accuracy (percentage of all predictions that are correct): 89%
Precision (percentage of predicted links that are correct): 55%
Recall (percentage of actual links that are predicted): 47%
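The three measures can be made concrete with a small Python sketch. It assumes a "prediction" is made for every candidate predator-prey pair in the held-out web (link or no link), so accuracy counts correctly predicted absences as well as correctly predicted links. The function name `scores` and the toy web are hypothetical.

```python
from itertools import product


def scores(predicted: set, actual: set, candidates: set) -> tuple[float, float, float]:
    """Return (accuracy, precision, recall) for one held-out web.

    predicted and actual are sets of (predator, prey) pairs drawn from candidates.
    """
    tp = len(predicted & actual)             # predicted links that are real
    fp = len(predicted - actual)             # predicted links that are not real
    fn = len(actual - predicted)             # real links that were missed
    tn = len(candidates) - tp - fp - fn      # correctly predicted absences
    accuracy = (tp + tn) / len(candidates)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Toy held-out web with three taxa; every ordered pair is a candidate link.
taxa = ["a", "b", "c"]
candidates = set(product(taxa, repeat=2))
actual = {("a", "b"), ("b", "c")}
predicted = {("a", "b"), ("a", "c")}

acc, prec, rec = scores(predicted, actual, candidates)
print(round(acc, 3), prec, rec)  # 0.778 0.5 0.5
```

Because most candidate pairs in a web are not links, correctly predicted absences can dominate accuracy, which is one way a high accuracy can sit alongside much lower precision and recall.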
Choosing parameters
30-web subsample, representative of habitats, years, number of nodes, and percent identified to species
Iterate over parameter settings
Tradeoff between precision (percentage of predicted links that are correct) and recall (percentage of actual links that are predicted)
Evolutionary distance threshold: 2 steps up and 4 steps down
[Figure: two surface plots over steps up (1 to 4) and steps down: precision (axis 0.3 to 0.5) and recall (axis 0.4 to 0.6)]
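One way to read a threshold of "2 steps up and 4 steps down" is: starting from a target taxon, climb at most 2 steps toward the root of the evolutionary tree, then descend at most 4 steps from each ancestor reached. The Python sketch below implements that reading, which is an assumption rather than a description of the Food Web Constructor's code. The function `relatives_within` and the toy tree (taxa X, A, C in placeholder genera and a placeholder family) are hypothetical.

```python
from collections import defaultdict


def relatives_within(taxon, parent, max_up=2, max_down=4):
    """Map each relative of taxon to (steps_up, steps_down).

    parent maps every taxon to its parent (None for the root).
    The closest route is kept when a relative can be reached several ways.
    """
    children = defaultdict(list)
    for child, par in parent.items():
        if par is not None:
            children[par].append(child)

    found = {}
    node, up = taxon, 0
    while node is not None and up <= max_up:
        # Breadth-first descent from the current ancestor.
        frontier = [(node, 0)]
        while frontier:
            current, down = frontier.pop(0)
            if current != taxon and current not in found:
                found[current] = (up, down)
            if down < max_down:
                frontier.extend((c, down + 1) for c in children[current])
        node = parent.get(node)
        up += 1
    return found


# Toy tree: two genera in one family.
parent = {
    "X": "genus1", "A": "genus1",
    "C": "genus2",
    "genus1": "family1", "genus2": "family1",
    "family1": None,
}

print(relatives_within("X", parent, max_up=1, max_down=1))
# {'genus1': (1, 0), 'A': (1, 1)}
print(relatives_within("X", parent)["C"])
# (2, 2)
```

Widening either limit brings in more distant relatives as evidence, which is the precision and recall tradeoff the parameter sweep explores.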
Evolutionary direction penalty not very sensitive
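\[ \mathrm{Weight}_{AB} = \frac{1}{1 + (\mathrm{Distance}_{XA} \times \mathrm{Penalty}_{XA}) + (\mathrm{Distance}_{YB} \times \mathrm{Penalty}_{YB})} \]
[Figure: direction of the relative: ancestor, descendant, siblings]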
Negative evidence discount is sensitive
\[ \mathrm{CertaintyIdx}_{XY} = \sum_{i=1}^{N} \mathrm{LinkValue}_i \left( \frac{\mathrm{weight}}{\mathrm{discount}} \right)_i \]
[Figure: recall and precision (axis 0.2 to 0.7) vs. negative evidence discount (0 to 100)]
Results over all webs
Is evolutionary distance weighting better than strict database search?
Paired t-tests, df = 251
*** p < 0.001
[Figure: bar chart (%) comparing database search with evolutionary distance weighting; three comparisons, each marked ***]
Database search is more precise, but evolutionary distance weighting has better recall.
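A paired t-test compares the two methods web by web: each web contributes one difference, and df = 251 corresponds to 252 paired webs. The sketch below shows the arithmetic in plain Python. The per-web recall values are made up for illustration and are not results from the talk; `paired_t` is a hypothetical helper.

```python
import math


def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    t = mean / math.sqrt(var / n)
    return t, n - 1


# Made-up recall per web for the two methods (four webs only).
recall_distance_weighting = [0.5, 0.6, 0.4, 0.7]
recall_database_search = [0.4, 0.4, 0.3, 0.6]

t_stat, df = paired_t(recall_distance_weighting, recall_database_search)
print(round(t_stat, 2), df)  # 5.0 3
```

The p-value then comes from the t distribution with the reported degrees of freedom; a statistics library would normally handle that step.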
Older webs contribute
Recall (percentage of actual links that are predicted): 47%; 48% with no EcoWEB data
Precision (percentage of predicted links that are correct): 55%; 39% with no EcoWEB data
…but large webs are harder to predict
[Figure: recall rate vs. number of taxa (0 to 200)]
large webs have better taxonomic resolution
[Figure: % identified to species vs. number of taxa (0 to 200)]
recent webs are bigger
[Figure: number of taxa vs. year of study (1910 to 2010)]
large webs have fewer unknown “taxa”
[Figure: % taxa unknown vs. number of taxa (0 to 200)]
Some phyla are easier to predict than others
[Figure: recall rate by phylum: Annelida, Arthropoda, Bacillariophyta, Chordata, Mollusca]
Trait space distance weighting
Euclidean distance in natural history N-space
Parameterize functions from the literature that might predict links using characteristics of taxa, for example size or stoichiometry.
\[ \mathrm{LinkStatus}_{AB} = f(\alpha, \mathrm{size}_A, \mathrm{size}_B),\ f(\beta, \mathrm{stoich}_A, \mathrm{stoich}_B),\ \ldots \]
…need more data
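As a rough illustration of distance in a natural history N-space, the sketch below treats each taxon as a point whose coordinates are trait values and measures the Euclidean distance between two taxa. The trait names and values are hypothetical, and the sketch assumes traits have already been put on comparable scales; it is not the weighting scheme from the talk.

```python
import math


def trait_distance(a: dict, b: dict, traits: list[str]) -> float:
    """Euclidean distance between two taxa over the listed traits."""
    return math.sqrt(sum((a[t] - b[t]) ** 2 for t in traits))


# Hypothetical, already-standardized trait values for two taxa.
taxon_a = {"log_mass": 0.0, "cn_ratio": 0.0}
taxon_b = {"log_mass": 3.0, "cn_ratio": 4.0}

d = trait_distance(taxon_a, taxon_b, ["log_mass", "cn_ratio"])
print(d)  # 5.0
```

A trait-space distance like this could stand in for the number of steps in the evolutionary tree when weighting evidence, provided trait data exist for the taxa involved.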
How can we do better at predicting links?
ETHAN: Evolutionary Trees and Natural History ontology
Animal Diversity Web (http://www.animaldiversity.org): geographic range, habitats, physical description, reproduction, lifespan, behavior and trophic info, conservation status
“Esox lucius” hasMaxMass “1.4 kg”
“Esox lucius” isSubclassOf “Esox”
“Esox” eats “Actinopterygii”
Triples
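The three triples above are enough to show why this representation is useful: a fact stated about a higher taxon can be combined with a subclass statement to say something about a species. The Python sketch below stores the triples as tuples and follows isSubclassOf links to collect prey. Treating diet as inherited by subclasses is a simplifying assumption made for illustration, and `match` and `inherited_diet` are hypothetical helpers, not part of any Triple Shop API.

```python
# The three example triples, stored as (subject, predicate, object).
triples = [
    ("Esox lucius", "hasMaxMass", "1.4 kg"),
    ("Esox lucius", "isSubclassOf", "Esox"),
    ("Esox", "eats", "Actinopterygii"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def inherited_diet(taxon, triples):
    """Prey stated for taxon or for any ancestor reached via isSubclassOf."""
    prey, seen, stack = set(), set(), [taxon]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        prey.update(o for _, _, o in match(triples, s=current, p="eats"))
        stack.extend(o for _, _, o in match(triples, s=current, p="isSubclassOf"))
    return prey


print(inherited_diet("Esox lucius", triples))  # {'Actinopterygii'}
```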
UMBC Triple Shop: Query
What are body masses of fishes that eat fishes?
Enter a SPARQL query:
SELECT DISTINCT ?predator ?prey ?preymaxmass ?predatormaxmass
WHERE {
  ?link rdf:type spec:ConfirmedFoodWebLink .
  ?link spec:predator ?predator .
  ?link spec:prey ?prey .
  ?predator rdfs:subClassOf ethan:Actinopterygii .
  ?prey rdfs:subClassOf ethan:Actinopterygii .
  OPTIONAL { ?predator kw:mass_kg_high ?predatormaxmass } .
  OPTIONAL { ?prey kw:mass_kg_high ?preymaxmass }
}
. . . leaving out the FROM clause
UMBC Triple Shop: Create a dataset
Find semantic web docs that can answer query.
Actinopterygii.owl
webs_publisher.php?published_study=11
Esox_lucius.owl
http://swoogle.umbc.edu
UMBC Triple Shop: Get results
Apply query to dataset with semantic reasoning.
http://sparql.cs.umbc.edu/tripleshop2/
Summary
Food Web Constructor uses evolutionary approach and large databases
We chose parameters using a subsample
Explored results over entire database
Evolutionary distance weighting recalls links better than database search
Older webs are useful
Large webs are harder to predict
Some phyla are easier than others to predict
For future algorithms, we can gather and integrate data via ontologies and intelligent agents
UMBC: Tim Finin, Joel Sachs, Andriy Parafiynyk, Li Ding, Rong Pan, Lushan Han
UMCP: David Wang
RMBL: Neo Martinez, Rich Williams, Jennifer Dunne
UC Davis: Jim Quinn, Allan Hollander
UMMZ Animal Diversity Web: Phil Myers, Roger Espinosa
UMCP: Bill Fagan, Bongshin Lee, Ben Bederson
http://spire.umbc.edu
ETHAN workflow
[Figure: workflow diagram. Labeled components: ADW database (MySQL); XSLT template; ADW taxon acct (HTML); Keywords (HTML); ETHAN Taxon acct (OWL); SPIRE taxon database (MySQL); Evolutionary Tree side of ontology (OWL); Phylum-sized ET chunk (OWL); Taxon Path (OWL); Filters; Acct data (tabular text); Others; ITIS; Plants, etc.; Animal name tree; Keywords (OWL)]
Semantic Prototypes In Ecoinformatics
[Figure: project diagram. Partners: UMBC, U Maryland, NASA Goddard, Rocky Mtn Bio Lab, UC Davis. Components: Semantic Web Tools; Info. Retrieval Agents; Food Web Constructor; Evidence Provider; Invasive Species Forecasting System; Remote Sensing Data; Food Webs; Ecological Interaction Ontologies; Species List constructor]
Food Web Constructor example: Nile tilapia in St. Marks
Question: What are potential predators and prey of Oreochromis niloticus in the St. Marks estuary in Florida?
Procedure: Submit species list for St. Marks, with Oreochromis niloticus added.
http://spire.umbc.edu/fwc
Food Web Constructor generates possible links
Evidence provider gives details
Nile tilapia – what organisms could be impacted?
Implications: parameterized functions
Requires good data for target species
Can incrementally add natural history functions to get better estimate, try different functions from literature, or use genetic algorithms
Parameterizing functions: multivariate statistics, machine learning, fuzzy inference
Could use evolutionary info if you localize parameter estimates to clades or taxonomic subsets
\[ \mathrm{LinkPredicted}_{CD} = f(\alpha, \mathrm{size}_C, \mathrm{size}_D) + f(\beta, \mathrm{stoich}_C, \mathrm{stoich}_D) \]
Distance weighting options
Evolutionary
Uses phylogeny or classification or a combination of these; assumes related organisms are like each other
Distance could be branch length or number of steps
Does not need natural history data
[Figure: tree with taxa X and Y; labels: 2 steps, 3 changes]
Ontologies
Richer way to design databases: instances of concepts that have well-defined meanings and formal relationships.
“Taxon A” hasBreedingDuration “5 months”
“Taxon A” hasAgeOfSexualMaturity “1 year”
“Higher Taxon” lives in “Australia”
“Taxon B” lives in “Australia”
“Taxon A” lives in “Australia”
[Figure: concept diagram linking Higher Taxon, Taxon A, Taxon B, Reproductive Characteristic, Breeding Season, Breeding Duration, Sexual maturity, and Age of Sexual Maturity through is-a and has-a relationships]