persistent and hackathons smart india hackathon the...© 2017 i4c 3 tm smart india hackathon smart...

Persistent and Hackathons

Smart India Hackathon

Digital Transformation

Our country needs audacious digital transformation to reach its potential

25% of India's population is between the ages of 16 and 25

Over 300 Million young people

The largest exporter of information technology services and skilled manpower among developing countries

…but a vast majority of its population still lacks the skills to meaningfully participate in the digital economy.

Smart India Hackathon

Smart India Hackathon is a non-stop digital product development competition, where problems are posed to technology students for innovative solutions. It…

Harnesses the creativity & expertise of students

Sparks institute-level hackathons

Builds a funnel for the ‘Startup India’ campaign

Crowdsources solutions for improving governance and quality of life

Provides citizens an opportunity to offer innovative solutions to India’s daunting problems

World’s biggest open innovation model!

How it works

Problem statements released by ministries

Students submit proposals

Evaluation + selection of finalists

Smart India Hackathon finals

Ingredients for a successful hackathon

• Executive sponsorship

• Theme identification

• Broad areas of problem domains, solution requirements identified

• Technology available:

• Communication and collaboration

• Platforms, hardware

• Internal and external resources

• Data

• Enablement

• Coaches

• Demos

• Licenses

• Internalization

• Innovation as a habit

• Traction

• Contact: sinaihackathon@persistent.com, shriram@persistent.com

VEGA: Healthcare data platform

Introduction: SanGeniX is a comprehensive and user-friendly NGS data analysis solution. Its rich graphical user interface (GUI) enables users to perform primary, secondary and tertiary analysis of NGS data in a seamless fashion.

Business Problem: NGS data analysis is very complicated and requires specialized bioinformatics expertise as well as computational resources.

Domain: Genomics

Areas of application: Personalized medicine, Clinical diagnostics, Animal genomics, Plant genomics, and Microbial genomics

Impact: Simplifies analysis and interpretation of NGS data for biologists.

SanGeniX: NGS data analysis solution

SanGeniX: NGS data analysis with workflows and visualization

Benefits

• Integration of industry standard algorithms ensures data reliability

• Automation - less hands-on time

• Point-and-click - easy to use

• Intuitive and innovative data visualization

• Role based access to ensure security and compliance

• Scalable

Business Value

• Increased productivity

• Cost effective due to cloud-hosted, pay-per-use solution

• Highly customizable to include customer-specific workflows

• Allows focus on biological questions and analysis instead of pondering over data

• HIPAA compliance

Important features

• Predefined workflows for all the common NGS analyses

• Integration of industry standard algorithms

• Flexibility to include custom algorithms

• Graphically enriched visualization

www.persistent.com

A comprehensive literature mining solution for biomedical literature

www.biogyan.com

Introduction

User’s Query

• Gene names

• Biological processes

• Cell-type/Tissue

Collate & Process PubMed articles

• Compute article relevance

• Highlight searched terms & sentences

• Generate holistic graphs & tables

User Action

• Study results

• Mark validity, add comments

• Export results to Spreadsheet, EndNote, BibTeX

Key features:

• Allows combinatorial search of genes, cell-types and cellular processes

• Ranks scientific articles based on relevance to the search terms

• Displays biological pathways and 3D structures of queried genes

• Allows offline reading of articles

• Available at: http://biogyan.com/
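The slides do not describe how BioGyan computes article relevance. As a rough illustration of the idea of ranking articles against a combined query, here is a minimal sketch that scores each abstract by how often the query terms occur in it. The function names, the dictionary layout and the term-count score are all assumptions made for this example, not BioGyan's actual method.

```python
import re
from collections import Counter


def relevance(text, terms):
    """Hypothetical score: total occurrences of the query terms in the text.

    Matching is case-insensitive and works on single-word terms only.
    """
    words = Counter(re.findall(r"[a-z0-9\-]+", text.lower()))
    return sum(words[term.lower()] for term in terms)


def rank_articles(articles, terms):
    """Return article ids ordered by descending score, dropping zero scores.

    `articles` is a list of dicts with "id" and "abstract" keys (an assumed
    layout for this sketch). Ties are broken by id for a stable order.
    """
    scored = [(relevance(a["abstract"], terms), a["id"]) for a in articles]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [article_id for score, article_id in scored if score > 0]


if __name__ == "__main__":
    # Toy abstracts written for this example, not real PubMed records.
    toy_articles = [
        {"id": "a1", "abstract": "TP53 regulates apoptosis. TP53 loss blocks apoptosis."},
        {"id": "a2", "abstract": "A review of imaging methods."},
        {"id": "a3", "abstract": "Apoptosis is observed in many tissues."},
    ]
    print(rank_articles(toy_articles, ["TP53", "apoptosis"]))
```

A production system would likely weight terms, handle synonyms and multi-word phrases, and score sentences as well as whole articles; none of that is attempted here.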

BioGyan - Screenshots

Biological pathway & 3D structure

Distribution of articles for a queried gene

Queried genes & their participation

Result Screen highlighting ranked articles

Market Space

• Life Sciences Research Space:

  • Academic research labs

  • Research Hospitals

  • Pharma/Biotech research labs

• Key market players:

  • All first and second tier Universities and Pharma/Biotech companies

• Industry Issues:

  • Literature mining is an integral part of R&D, drug discovery and regulatory filings.

• Key Services:

  • Licensed product

  • Literature mining service

  • Development of customized solution


Multiomics Project

Need of the hour: Omics data integration and analysis for precision medicine

Adapted from http://bmcmedgenomics.biomedcentral.com/articles/10.1186/s12920-015-0108-y

Multi-omics: Integrative biology

Meaningful insights

Disease: Biological make up and environment

Accurate diagnosis, prognosis and treatment

Precision Medicine

Multi-omics Platform

Development of a comprehensive multi-omics data analysis platform

Implementation of diverse analysis pipelines for addressing use-cases such as biomarker discovery and disease sub-typing

Data analytics and advanced visualization for deriving actionable outcomes

Open for collaboration

Multi-omics platform for deriving actionable insights from big biological data

Inter-relationships between omic data (e.g. genomics, transcriptomics, proteomics, metabolomics)

Leverage bioinformatics, data analytics and machine learning

Address diverse use-cases (e.g. diagnostic, drug discovery, precision medicine)

Lake

Network Analysis

Analytics

Machine Learning

Multi-omics approach for biomarker prediction from large scale data

Biomarker discovery pipelines: Multiple Approaches

Input layer: Multiomics datasets (multiple samples)

• Transcriptomics data (mRNA, miRNA)

• Genomics data (CNV, SNP, etc.)

• Proteomics data

• Metabolomics data

• Epigenetics data (Methylation, ChIP-Seq, etc.)

• Clinical data

Analysis layer: Multiple approaches

• Statistics/Bioinformatics analysis (individual dataset). Feature (gene) selection: statistically significant genes

• Pattern recognition: Machine Learning (individual/combined dataset). Feature (gene) selection: contribution to model building

• Pattern recognition: Data integration methods/tools (combined dataset)

Prediction layer: Putative biomarker prediction

Biomarker Prediction: Tertiary/Downstream analysis

Statistical Approach for biomarker prediction

Input Layer

• Genomics (CNV, SNP)

• Transcriptomics (mRNA, miRNA)

• Proteomics (RPPA, MS/MS)

• Metabolomics (LC/GC-MS)

• Epigenomics (Methylation, ChIP-Seq)

Individual Omic analysis layer

• Genomics: Normalization & Analysis (GISTIC2, CNTools, maftools); significant genes/genomic regions

• Transcriptomics: Normalization & Analysis (limma, edgeR, DESeq2); significant genes/miRNAs

• Proteomics: Protein Identification; Normalization & Analysis (Galaxy-P, PGA); significant proteins

• Metabolomics: Metabolite Identification (HMDB, MassBank, METLIN); Normalization & Analysis (SIMCA-P, R); significant metabolites

• Epigenomics: Normalization & Analysis (limma, MACS); significant genomic regions (methylation, enriched sites)

Integrative layer

Integrated Analysis: Correlation Analysis | Clustering / Cluster of clusters | Multivariate Analysis | Pathway analysis / Network Analysis

Prediction layer

Biomarker Prediction / Tertiary analysis: Literature Annotation | Network based analysis | Annotation from Databases

• Tool: BioGyan. Outcome: Literature based gene process associations

• Tool: iSubPathwayMiner. Outcome: List of risk pathways

• Tool: Cross-Talk module. Outcome: List of cross talking pathway pairs and network

• Tool: REACTOME FI. Outcome: PPI network, pathway enrichment, functional annotations

• Tool: ToppGene. Outcome: Annotations like molecular function, diseases, drugs, pathways, TFBS etc.
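The correlation analysis step of the integrative layer can be illustrated with a small sketch. It assumes two omics layers measured on the same samples in the same order, each stored as a mapping from feature name to a list of values, and it keeps features whose Pearson correlation across layers is strong. The function names, the data layout and the 0.8 threshold are assumptions made for this example; the slides do not specify them.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def correlated_features(layer_a, layer_b, threshold=0.8):
    """Return {feature: r} for features present in both layers with |r| >= threshold.

    Both layers map a feature name to per-sample values, with samples in the
    same order in each layer (an assumption of this sketch).
    """
    result = {}
    for feature in sorted(set(layer_a) & set(layer_b)):
        r = pearson(layer_a[feature], layer_b[feature])
        if abs(r) >= threshold:
            result[feature] = r
    return result


if __name__ == "__main__":
    # Toy values for illustration only.
    mrna = {"geneA": [1, 2, 3, 4], "geneB": [1, 2, 3, 4], "geneC": [1, 1, 1, 1]}
    protein = {"geneA": [2, 4, 6, 8], "geneB": [4, 1, 3, 2]}
    print(correlated_features(mrna, protein))
```

In practice the threshold would be replaced or complemented by a significance test with multiple-testing correction, which is left out to keep the sketch short.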

Machine Learning for biomarker prediction: Approach 1

Input Layer

• Genomics (CNV, SNP): significant genes/genomic regions

• Transcriptomics (mRNA, miRNA): Normalization & Analysis (limma, edgeR, DESeq2); significant genes/miRNAs

• Protein Identification: significant proteins

• Metabolite Identification (HMDB, MassBank, METLIN): significant metabolites

Machine learning

• Model1 to Model5, each with Feature Selection

• Nearest Neighbors, Linear SVM, RBF SVM, Gaussian Process, Decision Tree, Random Forest, Neural Networks, AdaBoost, Naive Bayes, QDA, Logistic Regression

Integrative layer

Integrated Analysis: Correlation Analysis | Clustering / Cluster of clusters | Multivariate Analysis | Pathway analysis / Network Analysis

Prediction layer

Literature Annotation | Network based analysis | Annotation from Databases

• Tool: BioGyan. Outcome: Literature based gene process associations

• Tool: iSubPathwayMiner. Outcome: List of risk pathways

• Tool: Cross-Talk module. Outcome: List of cross talking pathway pairs and network

• Tool: REACTOME FI. Outcome: PPI network, pathway enrichment, functional annotations

• Tool: ToppGene. Outcome: Annotations like molecular function, diseases, drugs, pathways, TFBS etc.
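Approach 1 pairs each omics dataset with its own model and feature selection step. The sketch below illustrates only the per-dataset selection idea. In place of a trained model's feature importance, it uses a simple univariate score (absolute difference of class means divided by the overall standard deviation). The score, the function names and the data layout are assumptions made for this example, not the pipeline's actual implementation.

```python
from statistics import mean, pstdev


def score_feature(values, labels):
    """Stand-in importance score: |mean(class 1) - mean(class 0)| / overall std.

    `labels` holds 0 or 1 per sample and must contain both classes.
    """
    spread = pstdev(values)
    if spread == 0:
        return 0.0
    positives = [v for v, label in zip(values, labels) if label == 1]
    negatives = [v for v, label in zip(values, labels) if label == 0]
    return abs(mean(positives) - mean(negatives)) / spread


def select_top_features(dataset, labels, k):
    """Return the k best-scoring feature names of one dataset (ties broken by name)."""
    ranked = sorted(dataset, key=lambda f: (-score_feature(dataset[f], labels), f))
    return ranked[:k]


def per_omics_selection(datasets, labels, k=1):
    """Run feature selection separately on every omics dataset."""
    return {name: select_top_features(data, labels, k) for name, data in datasets.items()}


if __name__ == "__main__":
    # Toy values for illustration only; labels mark two sample groups.
    labels = [1, 1, 0, 0]
    datasets = {
        "transcriptomics": {"g1": [5, 6, 1, 2], "g2": [3, 3, 3, 4]},
        "proteomics": {"p1": [1, 2, 1, 2], "p2": [9, 8, 1, 0]},
    }
    print(per_omics_selection(datasets, labels))
```

With a library such as scikit-learn, the score would typically be replaced by importances or coefficients taken from one of the classifiers listed on the slide.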

Machine Learning for biomarker prediction: Approach 2

Integrated Input Layer

• Two or more data sets combined, e.g. Genomics (CNV, SNP)

• Correlation Analysis | Clustering / Cluster of clusters | Multivariate Analysis

Machine Learning layer

• Combined Model with Feature Selection

• Machine learning: Nearest Neighbors, Linear SVM, RBF SVM, Gaussian Process, Decision Tree, Random Forest

Prediction layer

• Tool: BioGyan

• Tool: ToppGene
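Approach 2 trains a single model on two or more combined datasets. Before any model can be trained, the datasets have to be merged into one feature table per sample. A minimal sketch of that merge follows, assuming each dataset maps a sample id to its feature values. Only samples present in every dataset are kept, and feature names are prefixed with the dataset name so that the same gene measured in two layers stays distinct. The function name and data layout are assumptions made for this example.

```python
def combine_datasets(datasets):
    """Merge several omics datasets into one table keyed by sample id.

    `datasets` maps a layer name to {sample_id: {feature: value}}.
    Samples missing from any layer are dropped.
    """
    if not datasets:
        return {}
    shared_samples = set.intersection(*(set(data) for data in datasets.values()))
    combined = {}
    for sample in sorted(shared_samples):
        row = {}
        for layer, data in datasets.items():
            for feature, value in data[sample].items():
                # Prefix with the layer so identical gene names do not collide.
                row[f"{layer}:{feature}"] = value
        combined[sample] = row
    return combined


if __name__ == "__main__":
    # Toy values for illustration only.
    toy = {
        "expr": {"s1": {"g1": 1.0}, "s2": {"g1": 2.0}},
        "cnv": {"s2": {"g1": -0.5}, "s3": {"g1": 0.1}},
    }
    print(combine_datasets(toy))
```

Dropping samples that lack one of the layers is the simplest policy; imputing the missing layer is an alternative that this sketch does not cover.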

Data Integration methods for biomarker prediction

Input Layer

• Genomics (CNV, SNP)

Integrated layer

• mixOmics, MOcluster, iClusterPlus: cluster membership of samples, feature ranking, visualization (heatmaps, plots, etc.)

• OncoImpact: driver genes

Prediction layer

• Tool: BioGyan

• Tool: ToppGene

Current Use Case: TCGA Breast cancer data analysis for biomarker prediction

Data sources: PubMed, TCGA (Breast cancer datasets)

Sample groups: Normal, TNBC, Non-TNBC

Gene expression data (read counts per gene)

• Data filtering: Quality control

• Statistical analysis (DESeq2, edgeR, limma): statistically significant genes

• Machine learning: automated feature selection

Copy number variation data (segmentation mean data)

• Data filtering: Quality control

• Statistical analysis (GISTIC, CNTools): statistically significant genes

• Machine learning: automated feature selection

Data integration methods (MOcluster, mixOmics, iClusterPlus)

• High scoring genes

• Machine learning: automated feature selection

Biomarker prediction: Tertiary analysis

• TNBC biomarkers from literature

766 samples 766 samples

831 samples

831 samples 825 samples 825 samples
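The use case lists TNBC biomarkers from literature alongside the genes produced by the analysis branches, but the slides do not say how the two are compared. One plausible reading is a simple overlap check, sketched below: predicted genes found in the literature list count as supported, and the rest are left as candidates for follow-up. The function name and the case-insensitive matching of gene symbols are assumptions made for this example.

```python
def literature_overlap(predicted, known):
    """Split predicted genes into literature-supported and not-yet-reported lists.

    Gene symbols are compared case-insensitively and returned upper-cased
    in sorted order.
    """
    predicted_set = {gene.upper() for gene in predicted}
    known_set = {gene.upper() for gene in known}
    supported = sorted(predicted_set & known_set)
    unreported = sorted(predicted_set - known_set)
    return supported, unreported


if __name__ == "__main__":
    # Placeholder symbols, not real biomarkers.
    supported, unreported = literature_overlap(
        ["geneA", "GENEB", "geneC"], ["GENEA", "GENED"]
    )
    print("supported:", supported)
    print("unreported:", unreported)
```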
