Introduction to the Gramene Genetic Diversity module
5/2010, Build #31
• The Gramene Genetic Diversity database module specializes in storing data sets from studies of genetic variation and genotype-phenotype association in plant populations.
Rice
Maize
Arabidopsis
Sorghum
Wheat
• Our focus:
– Storage of large-scale SNP and indel genotype datasets and accompanying phenotype measurements
– Facilitating discovery of associations between genes and traits
– Species of primary interest: maize, rice, Arabidopsis. We also house some important wheat and sorghum data.
To find the Genetic Diversity module:
- Go to www.gramene.org and click on one of the Diversity links
- Or go there directly: www.gramene.org/diversity
Click on “Diversity” thumbnail
Click on “Diversity” link
What’s on the Diversity main page: Browse data
Access latest Diversity datasets
Large-scale SNP datasets
Browse through SSR, RFLP and other smaller-scale diversity or QTL studies here
What’s on the Diversity main page: Search data
Quick search for germplasm or markers
SNP Query Tool for mining SNP data
What’s on the Diversity main page: Downloading data
Download SNP data in hapmap, plink and flapjack format. Click on the DOWNLOAD link and we’ll take a look at the Download Data page on the next slide…
Download subsets of SNP data by using the SNP Query tool (more about this in a few slides)
Launch Diversity datasets live in TASSEL. Download graphs and results from analyses
All download options, including full Diversity MySQL db dumps
• The large-scale SNP datasets are offered for download in hapmap, plink (.map and .ped files) and flapjack format. The plink files can be loaded into the PLINK analysis program (pngu.mgh.harvard.edu/~purcell/plink), and the flapjack project files can be loaded into the Flapjack visualization tool (bioinf.scri.ac.uk/flapjack).
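For readers who want to inspect the plink downloads outside of PLINK itself, here is a minimal Python sketch of reading a .map/.ped pair. It assumes the standard PLINK text layout (four whitespace-separated columns in .map; six leading columns in .ped followed by two alleles per marker). The marker names, plant names, and genotypes are made up for illustration and are not Gramene data, and `read_map` / `read_ped` are hypothetical helper names.

```python
import io

# Tiny made-up example in PLINK text format (not real Gramene data).
MAP_TEXT = """\
1 snp_a 0 1000
1 snp_b 0 2500
"""
PED_TEXT = """\
fam1 plant1 0 0 0 -9 A A G T
fam2 plant2 0 0 0 -9 A C T T
"""


def read_map(handle):
    """Return a list of (chromosome, marker_id, bp_position) tuples."""
    markers = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # Assumed column order: chromosome, marker ID, genetic distance, bp position
        chrom, marker_id, _genetic_dist, position = fields[:4]
        markers.append((chrom, marker_id, int(position)))
    return markers


def read_ped(handle, n_markers):
    """Return {individual_id: [(allele1, allele2), ...]} with one pair per marker."""
    genotypes = {}
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        individual_id = fields[1]
        alleles = fields[6:]  # the first six columns are sample metadata
        if len(alleles) != 2 * n_markers:
            raise ValueError(f"{individual_id}: expected {2 * n_markers} alleles")
        genotypes[individual_id] = list(zip(alleles[0::2], alleles[1::2]))
    return genotypes


markers = read_map(io.StringIO(MAP_TEXT))
genotypes = read_ped(io.StringIO(PED_TEXT), len(markers))

# Pair each genotype call with the marker it belongs to.
for individual_id, calls in genotypes.items():
    for (chrom, marker_id, position), call in zip(markers, calls):
        print(individual_id, chrom, marker_id, position, "".join(call))
```

To use this on a real download, replace the `io.StringIO(...)` objects with `open("yourfile.map")` and `open("yourfile.ped")`; the file names here are placeholders.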
SNP download page: www.gramene.org/diversity/download_data.html
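The hapmap files from the download page are plain tab-delimited text, so they can also be summarized with a few lines of Python. The sketch below counts alleles per marker. It assumes the common hapmap layout of 11 fixed columns (starting with rs#, alleles, chrom, pos) followed by one column per sample, two-character genotype calls, and "NN" for missing data; those encoding details are assumptions, so check them against the file you actually download. The data and the `allele_counts` helper are made up for illustration.

```python
import io
from collections import Counter

# Made-up hapmap-style text (not real Gramene data).
HAPMAP_TEXT = (
    "rs#\talleles\tchrom\tpos\tstrand\tassembly#\tcenter\tprotLSID\t"
    "assayLSID\tpanelLSID\tQCcode\tplant1\tplant2\tplant3\n"
    "snp_a\tA/C\t1\t1000\t+\tNA\tNA\tNA\tNA\tNA\tNA\tAA\tAC\tNN\n"
    "snp_b\tG/T\t1\t2500\t+\tNA\tNA\tNA\tNA\tNA\tNA\tGG\tGG\tTT\n"
)

N_FIXED = 11  # assumed number of metadata columns before the sample columns


def allele_counts(handle):
    """Return (sample_names, {marker_id: Counter of allele letters})."""
    header = handle.readline().rstrip("\n").split("\t")
    samples = header[N_FIXED:]
    counts_by_marker = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < N_FIXED:
            continue  # skip blank or truncated lines
        counts = Counter()
        for call in fields[N_FIXED:]:
            if call != "NN":  # assumed missing-data code
                counts.update(call)  # counts each allele letter in the call
        counts_by_marker[fields[0]] = counts
    return samples, counts_by_marker


samples, counts_by_marker = allele_counts(io.StringIO(HAPMAP_TEXT))
for marker_id, counts in counts_by_marker.items():
    print(marker_id, dict(counts))
```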
Phenotype
Field/Plant information
Germplasm
Genotype
The Diversity module stores all of its data in MySQL using the GDPDM database model.
GDPDM links genotype, phenotype, germplasm, field and environment information in one resource.
GDPDM v 4.0 is optimized for efficient handling of large-scale SNP data by using BLOBs (binary large objects).
For full GDPDM documentation, go here: http://www.maizegenetics.net/gdpdm/documentation_list.html
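To give a feel for why a BLOB can help with large SNP datasets, here is a small Python sketch that packs one plant's calls for many markers into a single binary value instead of one row per call. This is only an illustration of the general idea: it is not the GDPDM schema or its actual BLOB encoding (see the GDPDM documentation for that). The table name, column names, and one-byte-per-call encoding are hypothetical, and SQLite stands in for MySQL so the example runs without a server.

```python
import sqlite3

# Made-up allele calls for one plant, one code per marker ("N" = missing).
calls = ["A", "C", "N", "G", "T", "A"]

# Pack the calls into a single binary value: one ASCII byte per marker.
blob = "".join(calls).encode("ascii")

conn = sqlite3.connect(":memory:")  # stand-in for a MySQL connection
conn.execute(
    "CREATE TABLE genotype_blob (accession TEXT, chromosome TEXT, calls BLOB)"
)
conn.execute(
    "INSERT INTO genotype_blob VALUES (?, ?, ?)", ("plant1", "1", blob)
)

# One row fetch returns every call for this plant on this chromosome.
row = conn.execute(
    "SELECT calls FROM genotype_blob WHERE accession = ?", ("plant1",)
).fetchone()
stored = bytes(row[0])

unpacked = [chr(b) for b in stored]  # back to a list of allele codes
third_call = chr(stored[2])          # direct lookup by marker index

print(unpacked, third_call)
conn.close()
```

With this layout, a dataset of thousands of markers per plant becomes one row per plant and chromosome rather than thousands of rows, at the cost of needing a separate marker-order table (not shown) to interpret each byte position.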
SNP QUERY
Web-based tool for mining SNP datasets in Gramene Genetic Diversity
Features of the ‘SNP Data’ area of Diversity: SNP QUERY TOOL
Click SEARCH link to go to SNP Query
Select a species
Select a dataset to load
Select a subset of plants in the experiment by holding down the SHIFT, ALT, or APPLE-CMD key while clicking. The default (selecting none) will return data for all accessions.
Select chromosome
Optionally enter start and stop coordinates for the range of interest
Select output format. ‘Text’ and ‘HTML’ will return the results in the browser; ‘File download’ will return the results as a downloadable csv text file.
Finally, click Submit
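The selections above amount to a filter on accession, chromosome, and position. The Python sketch below mimics that logic on a small in-memory CSV so the effect of each option is easy to see. It is not the SNP Query tool's code: the column names (accession, chromosome, position, genotype), the inclusive start/stop range, and the `query_snps` helper are all assumptions made for illustration, and the rows are made up.

```python
import csv
import io

# Made-up rows with hypothetical column names (not the tool's real output).
CSV_TEXT = """\
accession,chromosome,position,genotype
plant1,1,1000,AA
plant2,1,1000,AC
plant1,1,2500,GG
plant2,2,400,TT
"""


def query_snps(rows, chromosome, start=None, stop=None, accessions=None):
    """Filter rows the way the form options are described.

    chromosome is required; start/stop are optional and treated as
    inclusive here; an empty accession selection returns all accessions.
    """
    selected = []
    for row in rows:
        position = int(row["position"])
        if row["chromosome"] != chromosome:
            continue
        if start is not None and position < start:
            continue
        if stop is not None and position > stop:
            continue
        if accessions and row["accession"] not in accessions:
            continue
        selected.append(row)
    return selected


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Default: no accessions selected, no coordinate range.
all_chr1 = query_snps(rows, "1")

# One accession, restricted to positions 500 through 1500.
subset = query_snps(rows, "1", start=500, stop=1500, accessions={"plant2"})

print(len(all_chr1), subset)
```

The same function can be pointed at a file saved with the ‘File download’ option by passing `csv.DictReader(open(path))`, after adjusting the column names to match the real file.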
Here’s what the HTML results of SNP Query look like:
Hyperlinked to Gramene Genomes Ensembl browser
You can save the results to your hard disk as text (some browsers allow you to save in HTML), or you can redo your query and select the File Download option
TASSEL
Java program for evaluating trait associations, evolutionary patterns, and linkage disequilibrium
TASSEL: Launch Diversity SNP datasets with Java web-start
Click on the LAUNCH TASSEL link to access the web-start links
TASSEL: Launch Diversity SNP datasets with Java web-start
www.gramene.org/diversity/tassel_launch.html
Each hyperlink represents one chromosome’s worth of SNP data. When you click on one, a dialog box should appear asking whether you want to use Java Web Start to open the file; click OK. TASSEL will take a minute or two to launch…
Once TASSEL launches, here is what you should see:
Notice the dataset tag ‘chr1’ appears in the left bar. Click on ‘chr1’ to see the data….
When you click ‘chr1’ in the left Data column, the data will be displayed in the main window:
LD v Distance
Diversity sliding window
Phylogeny
LD
PCA of genotypes
With TASSEL you can perform many analyses. Please see the TASSEL documentation for much more information: maizegenetics.net/tassel