integrated data management for agricultural researchcbs/projects/2006_presentation...integrated data...

Integrated Data Management for Agricultural Research. Diganta Nath, Intern. Dr. Rosemary Renaut, Committee Chair (Director, Computational Biosciences PSM). Dr. Jeffrey W. White, Internship Advisor (ALARC, USDA-ARC, Maricopa, AZ). Dr. Hasan Davulcu, Committee Member (Dept. of Computer Science & Engg.).

Post on 04-Jun-2020

TRANSCRIPT

Page 1: Integrated Data Management for Agricultural Researchcbs/projects/2006_presentation...Integrated Data Management for Agricultural Research Diganta Nath, Intern Dr. Rosemary Renaut,

Integrated Data Management for Agricultural Research

• Diganta Nath, Intern
• Dr. Rosemary Renaut, Committee Chair
  ◦ Director, Computational Biosciences PSM
• Dr. Jeffrey W. White, Internship Advisor
  ◦ ALARC, USDA-ARC, Maricopa, AZ
• Dr. Hasan Davulcu, Committee Member
  ◦ Dept. of Computer Science & Engg.

Page 2:

Goals of Project

• Improve data management, analysis and distribution
  ◦ GIS analysis of Lesquerella fendleri
  ◦ GMS database for Lesquerella (LesquIS)
  ◦ Web interface for LesquIS

Page 3:

Goals contd.

  ◦ GMS database for Vernonia (VernIS)
  ◦ Web interface for VernIS
  ◦ Excel workbook as per ICASA standards

Page 4:

The novel crop: Lesquerella

  ◦ Contains oil rich in hydroxy fatty acid (HFA)
  ◦ Used in making resins, waxes, motor oils, etc.

Page 5:

Climate Analysis of L. fendleri Distribution using DIVA-GIS

• Data obtained from ALARC collections, ASU Herbarium
• Integrated into one database: 248 collections
• Collection locations linked to climate variables
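A minimal sketch of how collection locations might be linked to climate variables, assuming each variable is stored as a regular latitude/longitude grid. The `ClimateGrid` class, the `link_sites` helper and all field names are hypothetical illustrations, not DIVA-GIS's actual data structures.

```python
from dataclasses import dataclass


@dataclass
class ClimateGrid:
    """A regular latitude/longitude grid holding one climate variable."""
    north: float      # latitude of the grid's top edge (degrees)
    west: float       # longitude of the grid's left edge (degrees)
    cell_size: float  # cell width and height (degrees)
    values: list      # rows ordered north to south, each a list of numbers

    def value_at(self, lat, lon):
        """Return the value of the cell containing (lat, lon), or None if outside."""
        row = int((self.north - lat) // self.cell_size)
        col = int((lon - self.west) // self.cell_size)
        if 0 <= row < len(self.values) and 0 <= col < len(self.values[row]):
            return self.values[row][col]
        return None


def link_sites(sites, grids):
    """Attach one value per climate grid to each collection site.

    sites: list of dicts with at least "lat" and "lon" keys (assumed layout).
    grids: dict mapping a variable name to a ClimateGrid.
    """
    linked = []
    for site in sites:
        record = dict(site)  # copy so the input records stay unchanged
        for name, grid in grids.items():
            record[name] = grid.value_at(site["lat"], site["lon"])
        linked.append(record)
    return linked
```

Sites that fall outside a grid receive `None` for that variable, which makes missing climate data easy to spot before any analysis is run.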

Page 6:

DIVA contd.

• Frequency of mean temperature during the wettest quarter for collection sites of L. fendleri

Page 7:

DIVA contd.

• Frequency of precipitation of the driest quarter

Page 8:

Climate Analysis algorithms

• BIOCLIM
  ◦ Extracts the climate data set for the collection points
  ◦ Computes the mean and standard deviation for each climatic variable
  ◦ Builds an envelope identifying similar areas based on percentile
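The BIOCLIM steps listed here can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the DIVA-GIS implementation: the function names, the 5th/95th percentile defaults and the dictionary layout are all hypothetical, and the percentile uses plain linear interpolation.

```python
import statistics


def percentile(sorted_vals, p):
    """Linearly interpolated percentile (p in 0..100) of an ascending list."""
    if len(sorted_vals) == 1:
        return sorted_vals[0]
    k = (len(sorted_vals) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def build_envelope(site_climate, lower=5.0, upper=95.0):
    """Summarise each climatic variable observed at the collection sites.

    site_climate: {variable name: [values at the collection sites]} (assumed layout).
    Returns, per variable, the mean, standard deviation and percentile bounds.
    """
    envelope = {}
    for var, vals in site_climate.items():
        ordered = sorted(vals)
        envelope[var] = {
            "mean": statistics.mean(ordered),
            "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
            "low": percentile(ordered, lower),
            "high": percentile(ordered, upper),
        }
    return envelope


def in_envelope(cell_climate, envelope):
    """True when every variable of a grid cell lies within the percentile bounds."""
    return all(
        envelope[var]["low"] <= cell_climate[var] <= envelope[var]["high"]
        for var in envelope
    )
```

Applying `in_envelope` to every grid cell would yield a map of areas climatically similar to the collection sites; adding a variable can only shrink the envelope, since a cell must pass every bound.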

Page 9:

Algorithms contd.

• DOMAIN
  ◦ Based on Gower distance: a relative measure of similarity (absolute distance / maximum distance)
  ◦ Calculates the Gower distance between the collection points and each cell
  ◦ Generates a map based on similarity
  ◦ $D = (1 - d_{AB}) \times 100$, where $d_{AB}$ is the Gower distance between points A and B
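A small sketch of the DOMAIN similarity score, assuming the Gower distance is the mean over variables of the absolute difference divided by that variable's range across the collection sites. Scoring a cell against its nearest collection point follows the usual description of the DOMAIN procedure and is an assumption here, as are the function names and the dictionary layout.

```python
def variable_ranges(sites):
    """Range (max - min) of each variable across the collection sites."""
    ranges = {}
    for var in sites[0]:
        values = [site[var] for site in sites]
        # Fall back to 1.0 when all sites share one value, to avoid dividing by zero.
        ranges[var] = (max(values) - min(values)) or 1.0
    return ranges


def gower_distance(a, b, ranges):
    """Mean range-standardised absolute difference between two points."""
    return sum(abs(a[var] - b[var]) / ranges[var] for var in ranges) / len(ranges)


def domain_similarity(cell, sites):
    """D = (1 - d) * 100, with d the distance to the closest collection site."""
    ranges = variable_ranges(sites)
    nearest = min(gower_distance(cell, site, ranges) for site in sites)
    return (1.0 - nearest) * 100.0
```

A cell whose climate matches a collection site exactly scores 100, and the score falls as the cell moves away from every site; cells far outside the observed ranges can score below zero.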

Page 10:

DIVA - BIOCLIM

• BIOCLIM analysis: 3 variables

Page 11:

BIOCLIM contd.

• BIOCLIM analysis: 4 variables

Page 12:

DIVA - DOMAIN

• DOMAIN analysis: 3 variables

Page 13:

DOMAIN contd.

• DOMAIN analysis: 4 variables

Page 14:

DIVA: Next step

• Collect more distribution data
• Involve soil chemistry
• Assess other niche modeling methods

Page 15:

LesquIS GMS

• ICIS database
  ◦ GMS
• Import data from Excel into Excel spreadsheet
• Implement standardized process

Page 16:

GMS database ER diagram

[Figure: entity-relationship diagram of the GMS database. Entities: Germplasm (Generative Germplasm, Derivative Germplasm), METHODS, NAMES, LOCATIONS, USERS, USER-DEFINED FIELDS, ATTRIBUTES. Relationship labels: has progenitors (recursive links), has group, has source, developed by, developed at, called, named by, named at, with value of, defined property, assigned by, assigned at.]
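The recursive "has progenitors" link in the diagram can be illustrated with a small in-memory model. This is only a sketch: the `Germplasm` class, its field names and the `ancestors` helper are hypothetical and do not reproduce the actual GMS tables or columns.

```python
from dataclasses import dataclass, field


@dataclass
class Germplasm:
    """One germplasm record, loosely mirroring the entities in the diagram."""
    gid: int                                          # germplasm identifier
    names: list = field(default_factory=list)         # NAMES the germplasm is called
    method: str = ""                                  # METHODS: how it was developed
    location: str = ""                                # LOCATIONS: where it was developed
    progenitors: list = field(default_factory=list)   # gids of parents (recursive link)
    attributes: dict = field(default_factory=dict)    # ATTRIBUTES: property -> value


def ancestors(gid, registry):
    """Collect the gids of all known ancestors by following progenitor links.

    registry: dict mapping gid -> Germplasm. Progenitors missing from the
    registry are skipped, and the seen-set guards against cyclic links.
    """
    seen = set()
    stack = list(registry[gid].progenitors)
    while stack:
        parent = stack.pop()
        if parent in seen or parent not in registry:
            continue
        seen.add(parent)
        stack.extend(registry[parent].progenitors)
    return seen
```

Storing only the direct progenitors on each record and walking the links on demand is what makes the relationship recursive: a full pedigree never has to be stored twice.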

Page 17:

LesquIS GMS contd.

• Form for loading new accessions

Page 18:

LesquIS GMS contd.

• Form for loading attributes

Page 19:

LesquIS web

• Custom web interface to search Lesquerella germplasm records
• Technology used:
  ◦ Microsoft Active Server Pages
  ◦ MySQL server database
• Hosted on IIS server
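The kind of search such a web interface performs can be sketched as a parameterised query. The slides name Active Server Pages and MySQL; the example below swaps in Python with an in-memory SQLite database purely so that it is self-contained. The `accessions` table, its columns and the demo rows are invented placeholders, not the LesquIS schema or data.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a few made-up accession rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accessions (accession_id INTEGER PRIMARY KEY, name TEXT, location TEXT)"
    )
    conn.executemany(
        "INSERT INTO accessions (accession_id, name, location) VALUES (?, ?, ?)",
        [
            (1, "fendleri-demo-1", "Site A"),  # placeholder data, not real records
            (2, "fendleri-demo-2", "Site B"),
            (3, "other-demo-3", "Site C"),
        ],
    )
    return conn


def search_accessions(conn, term):
    """Return accessions whose name contains `term`.

    The search term is passed as a bound parameter rather than pasted into
    the SQL string, so user input cannot alter the query itself.
    """
    cursor = conn.execute(
        "SELECT accession_id, name, location FROM accessions "
        "WHERE name LIKE ? ORDER BY accession_id",
        (f"%{term}%",),
    )
    return cursor.fetchall()
```

Binding the user's input as a parameter is the design choice that carries over to any server-side language and database pairing, including the one used for the original site.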

Page 20:

LesquIS web contd.

• MS-Access to MySQL conversion
• Custom tool: Navicat
• MySQL Enterprise Manager

Page 21:

LesquIS web demo

• Lesquerella Information System

Page 22:

VernIS GMS

• Vernonia (ironweed)
  ◦ Contains oil rich in epoxy fatty acids.
  ◦ Potential use as plasticizers and additives in PVC, and as a drying agent in paints.
  ◦ Research initiated at ALARC in 1990.

Page 23:

VernIS contd.

• Collection information in Excel.
• Same methodology and tools used as in LesquIS.
• No changes to tools were necessary.
• VernIS GMS database implemented.

Page 24:

VernIS Web

• Similar to LesquIS web
• Used the same code base and process
• Hosted on the same web server and database server

Page 25:

VernIS web demo

• Vernonia Information System

Page 26:

Excel workbook for data collection and data interchange

• Preformatted workbook for collecting data of field experiments
  ◦ Not all workers need or want a complex system like ICIS
• Functionalities to import/export data
• XML output
• ICASA standards
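As an illustration of the XML output step, the sketch below serialises rows of field-experiment data to XML with Python's standard library. The element names (`experiment`, `observation`) and the column names in the usage note are hypothetical; they are not taken from the ICASA standards or from the workbook itself.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(experiment_id, rows):
    """Serialise worksheet rows to an XML string.

    rows: list of dicts mapping a column name to a cell value. Column names
    must be valid XML element names (no spaces, not starting with a digit).
    """
    root = ET.Element("experiment", id=str(experiment_id))
    for row in rows:
        observation = ET.SubElement(root, "observation")
        for column, value in row.items():
            # Each worksheet column becomes one child element of the observation.
            ET.SubElement(observation, column).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

For example, `rows_to_xml("EXP1", [{"plot": 1, "yield_kg_ha": 1520}])` produces one `observation` element with `plot` and `yield_kg_ha` children, which another tool can read back with `ET.fromstring`.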

Page 27:

Workbook demo

Page 28:

Conclusion and Future direction

• DIVA
• LesquIS and VernIS
• ICASA

Page 29:

Special Thanks

• Dr. Rosemary Renaut
• Dr. Jeffrey W. White
• Dr. Hasan Davulcu
• Dr. Dave Dierig
• Pernell Tomasi
• Dr. Andrew Salywon


Page 31:

Questions?