(hls303) how cloud computing is redefining research: secure, collaborative science at scale | aws...
DESCRIPTION
Genome sequencing technologies have lowered costs and increased data output at a much faster rate than traditional hardware refresh cycles can service effectively. With lowered costs, more biomedical research and clinical operations are integrating sequencing into a larger portion of their projects, compounding an already tough problem for IT operations. When Regeneron committed to using Next Generation Sequencing technology for biopharmaceutical research, the decision was made to keep the server room empty and deliver all genomic analysis and storage services using AWS and DNAnexus. We review the consequences of this decision, illustrating how basic AWS core services are game-changers in the design and operation of a scientific infrastructure at scale. We show how Amazon S3 changes the storage performance equation, discuss the impact of new instance types on the cost of genomic analysis, and explain how secure collaboration enables Regeneron to advance biopharmaceutical research techniques and support the future of genomic medicine.
TRANSCRIPT
November 13, 2014 | Las Vegas, NV
Angel Pizarro, Amazon Web Services | Omar Serang, DNAnexus | Jeffrey Reid, Regeneron Pharmaceuticals
Facilities
Physical security
Compute infrastructure
Storage infrastructure
Network infrastructure
Virtualization layer (EC2)
Hardened service endpoints
Rich IAM capabilities
Network configuration
Security groups
OS firewalls
Operating systems
Applications
Proper service configuration
Auth & acct management
Authorization policies
+ =
Customer/Partner
• Re-focus your security professionals on a subset of the problem
• Take advantage of high levels of uniformity and automation
First global public cloud provider to achieve certification for security & quality management system
EC2
10.0.2.12
AWS region – VPC network isolation
AZ A AZ B
VPC 10.0.0.0/16
SN 10.0.1.0/24 (DMZ) SN 10.0.2.0/24 (Private)
(23.20.103.11)
Internet
EC2
10.0.1.11
Internet GW Service
Virtual Gateway
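The address layout on this slide can be checked with a few lines of Python. This is only a sketch using the standard `ipaddress` module and the ranges printed on the slide; the `locate` helper and its labels are illustrative names, not part of any AWS API.

```python
import ipaddress

# Address ranges exactly as printed on the slide.
VPC = ipaddress.ip_network("10.0.0.0/16")
SUBNETS = {
    "DMZ": ipaddress.ip_network("10.0.1.0/24"),
    "Private": ipaddress.ip_network("10.0.2.0/24"),
}


def locate(address: str) -> str:
    """Return the slide's subnet name that holds `address`, or a fallback label."""
    ip = ipaddress.ip_address(address)
    for name, subnet in SUBNETS.items():
        if ip in subnet:
            return name
    return "inside VPC, no listed subnet" if ip in VPC else "outside VPC"


# Both subnets must be carved out of the VPC range and must not overlap.
assert all(subnet.subnet_of(VPC) for subnet in SUBNETS.values())
assert not SUBNETS["DMZ"].overlaps(SUBNETS["Private"])

for addr in ("10.0.1.11", "10.0.2.12", "23.20.103.11"):
    print(addr, "->", locate(addr))
```

The two EC2 addresses fall in the DMZ and private subnets respectively, while the address shown in parentheses on the slide lies outside the VPC range, as expected for a public address.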
AWS system entitlements
Roles
Root Account
Multi-factor authentication
Groups
Administrators Developers Applications
Alyson
Anand
Tomcat
Jim
Shandra
Xiao
Susan
Reporting
Console
You get to choose who can do what in your AWS environment and from where
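As an illustration of "who can do what ... and from where", here is a minimal sketch that builds an IAM policy document in Python. The policy grammar (`Version`, `Statement`, `Effect`, `Action`, `Resource`, `Condition`, `aws:SourceIp`) is standard IAM JSON; the bucket name and CIDR range are hypothetical placeholders, not values from the talk.

```python
import json


def read_only_policy(bucket: str, allowed_cidr: str) -> dict:
    """Build a policy allowing read-only S3 access, only from one IP range."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # "what": read-only actions on a single bucket
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # "from where": restrict callers to a source IP range
                "Condition": {"IpAddress": {"aws:SourceIp": allowed_cidr}},
            }
        ],
    }


# Hypothetical bucket name and documentation-only CIDR block.
policy = read_only_policy("example-genomics-bucket", "203.0.113.0/24")
print(json.dumps(policy, indent=2))
```

The "who" comes from attaching a policy like this to a group such as Developers or to a role, so individual users inherit it rather than carrying their own copies.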
Manage and operate
US EAST
A
VPC
Internet GW Service
Virtual Gateway
B
SM
(EIP)
(EIP)
AWS account owner (master)
Network & security
Researcher Operations EMR
Amazon S3
• HTTPS
• Server-side encryption of each object
• AWS-managed keys
• Customer-managed keys
• End-to-end secure network traffic
Amazon EBS
• Whole volume encryption
• AWS-managed keys
• Encrypted incremental snapshots
• Minimal performance overhead (utilizes Intel AES-NI)
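The server-side encryption choices for S3 objects can be sketched as request parameters. The keys below (`ServerSideEncryption`, `SSEKMSKeyId`) are the parameter names of the S3 PutObject call as exposed by boto3; mapping the slide's "customer-managed keys" to AWS KMS is an assumption, and the bucket, object key, and KMS key id are hypothetical. The function only builds the arguments and makes no network call.

```python
from typing import Optional


def put_object_args(bucket: str, key: str, body: bytes,
                    kms_key_id: Optional[str] = None) -> dict:
    """Build keyword arguments for an encrypted S3 PutObject request."""
    args = {"Bucket": bucket, "Key": key, "Body": body}
    if kms_key_id is None:
        # AWS-managed keys: S3 encrypts each object with AES-256.
        args["ServerSideEncryption"] = "AES256"
    else:
        # Assumed reading of "customer-managed keys": a key held in AWS KMS.
        args["ServerSideEncryption"] = "aws:kms"
        args["SSEKMSKeyId"] = kms_key_id
    return args


# Hypothetical names, for illustration only.
default_args = put_object_args("example-bucket", "sample.bam", b"...")
kms_args = put_object_args("example-bucket", "sample.bam", b"...",
                           kms_key_id="example-key-id")
print(default_args["ServerSideEncryption"], kms_args["ServerSideEncryption"])
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `s3_client.put_object(**default_args)`; requests made that way go over HTTPS by default.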
• Records API calls, no matter how those API calls were made (console, SDK, CLI)
• Who did what and when and from what IP address
• Logs saved to Amazon S3
• Includes Amazon EC2, Amazon EBS, Amazon VPC, Amazon RDS, IAM, AWS STS, and Amazon Redshift
• Be notified of log file delivery by using the Amazon Simple Notification Service (Amazon SNS)
• Aggregate log information across services into a single S3 bucket
• Out of the box integration with log analysis tools from AWS partners including Splunk, AlertLogic, and SumoLogic
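To make "who did what and when and from what IP address" concrete, here is a minimal sketch that reads those four facts out of a CloudTrail log file. The field names (`Records`, `eventTime`, `eventName`, `sourceIPAddress`, `userIdentity`) follow the CloudTrail record format; the sample records themselves are invented for illustration and do not come from the talk.

```python
import json

# Invented sample in the shape of a CloudTrail log file (not real data).
SAMPLE_LOG = """
{"Records": [
  {"eventTime": "2014-11-13T18:02:11Z", "eventSource": "ec2.amazonaws.com",
   "eventName": "RunInstances", "sourceIPAddress": "203.0.113.10",
   "userIdentity": {"type": "IAMUser", "userName": "example-user"}},
  {"eventTime": "2014-11-13T18:05:42Z", "eventSource": "iam.amazonaws.com",
   "eventName": "CreateUser", "sourceIPAddress": "198.51.100.7",
   "userIdentity": {"type": "Root"}}
]}
"""


def who_did_what(log_text: str) -> list:
    """Return (who, what, when, from_ip) tuples for each logged API call."""
    rows = []
    for record in json.loads(log_text)["Records"]:
        identity = record["userIdentity"]
        # Root and role sessions carry no userName; fall back to the type.
        who = identity.get("userName", identity.get("type", "unknown"))
        rows.append((who, record["eventName"], record["eventTime"],
                     record["sourceIPAddress"]))
    return rows


for row in who_did_what(SAMPLE_LOG):
    print(row)
```

In practice the log files land in S3 as compressed JSON, so a real reader would first download and decompress each object before parsing it this way.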
Virtual Private Cloud
AWS Cloud
Amazon S3
Amazon EBS
IAM
CloudTrail
DNAnexus is a platform-as-a-service for genome informatics. It is now in production with over 20% of high-throughput sequencing installations globally.
It enables collaboration and data sharing within a secure and compliant environment that helps attract new users, creating a “network effect” for genomics.
Platform launched in 2013. Located in Silicon Valley.
* Source: JP Morgan
Baylor College of Medicine
14,691 individual genomes
430 TB analytic data
3.3M core-hours
20,800 cores
300+ researchers
1 customer. 1 project. 1 week.
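The figures on this slide can be related to each other with some simple arithmetic. The sketch below uses only the numbers shown; it assumes all cores were busy for the whole run and that 1 TB = 1,000 GB, so the results are rough per-genome averages rather than measured values.

```python
# Figures as printed on the slide.
GENOMES = 14_691
DATA_TB = 430
CORE_HOURS = 3_300_000
CORES = 20_800

# Wall-clock time if every core were busy for the entire run (an assumption).
hours_per_core = CORE_HOURS / CORES
days_at_full_use = hours_per_core / 24

# Rough averages per genome (assumes 1 TB = 1,000 GB).
core_hours_per_genome = CORE_HOURS / GENOMES
gb_per_genome = DATA_TB * 1000 / GENOMES

print(f"days at full utilization: {days_at_full_use:.1f}")
print(f"core-hours per genome:    {core_hours_per_genome:.0f}")
print(f"analytic data per genome: {gb_per_genome:.1f} GB")
```

Under these assumptions the run works out to a little under seven days of full utilization, which is consistent with the slide's "1 week".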
REGENERON
$30B pharma company focused on cholesterol, oncology, and rheumatoid arthritis
GEISINGER
$3.5B health care provider for 3M patients
Amazon S3: the center of the universe
Amazon EC2: job manager and payloads
Amazon EC2: application services
Leveraging the AWS detailed billing report
VPC + Dx: connecting partners
DNAnexus: connecting science
Samples (extracted DNA)
1.4M Sample BioBank
Library Prep Automation
Illumina HiSeq Fleet
Data
Data QC
[Chart residue: two time-series plots with yearly x-axis ticks from 9/28/1998 to 9/28/2013; titles and data series not recoverable]
http://bit.ly/awsevals