harvard it summit 2016 - opencast in the cloud at harvard dce- live and on-demand from the dce...

Extending Harvard to part-time learners with the academic ability, curiosity and drive to succeed at Harvard.

Upload: kevindonovan

Post on 12-Apr-2017


TRANSCRIPT

Page 1: Harvard it summit 2016  - opencast in the cloud at harvard dce- live and on-demand from the dce classroom

Extending Harvard to part-time learners with the academic ability, curiosity and drive to succeed at Harvard.

Page 2:

Opencast in the Cloud at Harvard DCE

Video and Course Content Capture, Processing, Management, and Distribution

Page 3:

[email protected]

Software Architect and Director of Software Development

Page 4:

Our Reliability Requirements

● I am on-call 24/7/365

Page 5:

Our Current Architecture
● Capture Agents (CAs) are the only machines on campus
● Everything else is in AWS
● Live content: CAs → Akamai
● Video On Demand (VOD) content: CAs → Ingest to admin → processed on workers → producers polish content → output files push to S3 → CloudFront

Page 6:

Goals:
● Unusually high quality recording and playback
○ Sleek, modern player
● Live streaming and Video On Demand (VOD)
● Global audience - consistent experience regardless of timezone
● Exceptionally reliable - capture, transcoding, and distribution
● Fast processing time; volume of content increases most semesters
● Robust - exception cases handled quickly and ideally automatically
● Very responsive support

Page 7:

Goals
● Tools for production staff, teaching staff, technical staff, operations:
○ Archive, republishing
○ Production team interface, including trimming, uploading content from other sources (e.g., Premier)
○ Workflow Browser, Student Viewing Analytics, Capture Agent Status Board,
● Automated deployments:
○ Consistency between Production, Stage and Development clusters
○ Means to run large scale experiments:
■ Confirm that new optimizations will work as expected in Prod
■ Tune performance - storage, compute, networking configuration
■ Track down bugs that only happen under heavy load, to test

Page 8:

At its root: an extensible workflow engine
● “Workflows” are a series of “Workflow Operations”
● Workflow Operation = Java Class:
○ Transcode, automatically detect slide changes, etc
○ Can do anything Java can do
● Workflows = XML file
○ Series of operations listed in order, with some transitions, e.g., wait until producer sets start and end points
○ Different workflows for different use cases. E.g., live capture or inject content from Premier
● Current production workflows are ~20 steps long
● It’s fairly easy to handle new processing use cases:
○ Write a workflow
○ Write any new Operations as needed
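
The workflow idea on this slide can be sketched in a few lines. This is an illustrative model only, written in Python rather than Java for brevity. The names `MediaPackage`, `OPERATIONS`, `run_workflow`, and the three operation names are hypothetical; they are not Opencast's classes, operation identifiers, or XML schema.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MediaPackage:
    """Hypothetical stand-in for the recording a workflow acts on."""
    title: str
    history: List[str] = field(default_factory=list)


# An operation is any callable that takes the media package and updates it.
Operation = Callable[[MediaPackage], None]


def transcode(mp: MediaPackage) -> None:
    mp.history.append("transcode")


def detect_slide_changes(mp: MediaPackage) -> None:
    mp.history.append("detect-slide-changes")


def publish(mp: MediaPackage) -> None:
    mp.history.append("publish")


# Registry mapping operation names to implementations (names are made up).
OPERATIONS: Dict[str, Operation] = {
    "transcode": transcode,
    "detect-slide-changes": detect_slide_changes,
    "publish": publish,
}


def run_workflow(steps: List[str], mp: MediaPackage) -> MediaPackage:
    """Run the named operations in order; fail on an unknown operation."""
    for name in steps:
        if name not in OPERATIONS:
            raise KeyError(f"unknown workflow operation: {name}")
        OPERATIONS[name](mp)
    return mp


if __name__ == "__main__":
    lecture = run_workflow(
        ["transcode", "detect-slide-changes", "publish"],
        MediaPackage("Lecture 1"),
    )
    print(lecture.history)
```

In this model, supporting a new use case means registering any new operation and listing it in a new step sequence, which mirrors the "write a workflow, write any new Operations" pattern described on the slide.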

Page 9:

Architecture - Goal

[Diagram. Labels: Classrooms (each with a Camera and a Laptop), A/V, Miracles, Magic, Students]

Page 10:

Architecture - Capture Agents (CAs)

[Diagram. Labels: CA, Camera, Laptop (repeated per classroom); A/V; Miracles; Magic; Students]

Page 11:

The Search for Capture Agents (CAs)
● Most 3rd party systems require that you buy their hardware and software:
○ Tight integration that you don’t have to worry about
○ Limited by their box’s quality, resolutions, bitrates, number of streams, types of streams, etc
○ They set the price, and declare the lifetime of your box (e.g., 3 years)
○ Is this good? Bad? Depends on your use cases and funding
● Opencast’s CA API is fully open
○ Any capture hardware that has an API
○ Any capture software compatible with the CA API

Page 12:

The Search for Capture Agents (CAs)
○ Capture Hardware options:
■ Vendor CAs (including their capture software) - Extron, NCast, DataPath, Teltek, etc
■ Build it yourself, or have an OEM build it for you.
■ General A/V boxes - KiPros, Epiphan Pearls, etc
○ Vendors’ Opencast specific CAs - Open Source Capture Agent (CA) software:
■ Galicaster
■ PyCA
■ Harvard DCE’s mhpearl
○ The hard parts:
■ Identifying the use cases you need to solve
■ Finding a CA that works well with your A/V setups in your classrooms, with your networking, etc

Page 13:

Epiphan Pearl
● Commodity A/V recorder
○ Maxwell Dworkin 119. Single Epiphan. Used by two different capture systems
● Delightfully reliable
● You can mix and match up to 4 sources into a single channel - side by side, picture in picture, etc.
● Multiple channels can be recorded and streamed at the same time.
● (Not used) Has the ability to do live switching between sources if an operator is available to run it during an event.

Page 14:

Add Opencast VOD; add multiple bitrate streams

[Diagram. Labels: CA, Camera, Laptop (repeated per classroom); A/V; Magic; Opencast; Workflow Engine; Engage API; Students’ Browsers]

Page 15:

Playback Resolutions and Bit Rates

Video On Demand (VOD):
● 720p @ 5Mbps
● 540p @ 2Mbps
● 360p @ 400Kbps
● 180p @ 150Kbps

Live Streaming:

Doublewide - Presenter and Presentation in a single stream
● 1080p @ 5Mbps
● 810p @ 2Mbps
● 540p @ 400Kbps
● 270p @ 150Kbps
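
To get a feel for what these bitrates mean for a viewer, the sketch below converts each rendition into data volume per hour of playback. It assumes a constant bitrate and decimal units (1 Mbps = 10^6 bits per second, 1 GB = 10^9 bytes); neither assumption is stated on the slide, and the names `VOD_LADDER`, `LIVE_LADDER`, and `gigabytes_per_hour` are illustrative.

```python
# Bitrates in bits per second, copied from the two lists on this slide.
VOD_LADDER = {
    "720p": 5_000_000,
    "540p": 2_000_000,
    "360p": 400_000,
    "180p": 150_000,
}
LIVE_LADDER = {
    "1080p": 5_000_000,
    "810p": 2_000_000,
    "540p": 400_000,
    "270p": 150_000,
}


def gigabytes_per_hour(bits_per_second: int) -> float:
    """Data volume of one hour of playback at a constant bitrate."""
    seconds_per_hour = 3600
    bits_per_byte = 8
    return bits_per_second * seconds_per_hour / bits_per_byte / 1e9


if __name__ == "__main__":
    for label, ladder in (("VOD", VOD_LADDER), ("Live", LIVE_LADDER)):
        for name, bps in ladder.items():
            print(f"{label} {name}: {gigabytes_per_hour(bps):.4f} GB/hour")
```

Under these assumptions the 5Mbps renditions come to 2.25 GB per hour, while the 150Kbps renditions stay under 0.07 GB per hour.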

Page 16:

Isn’t that a lot of data to push?
● Mezz files are 20-30GB per hour of lecture
● Use existing infrastructure:
○ Harvard’s Internet2 connection is >= 100Gbps
○ Amazon’s Internet2 connection is >= 40Gbps
● Slowest hop:
○ From classrooms to Harvard’s outgoing switches
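
A rough back-of-the-envelope check of these numbers, as a sketch: the functions below compute the average bitrate implied by a mezzanine file size and the theoretical transfer time over a link. Decimal units are assumed, protocol overhead is ignored, and the 1 Gbps classroom link used in the example is a hypothetical figure, since the slide does not give the speed of the slowest hop.

```python
def mezz_mbps(gb_per_hour: float) -> float:
    """Average bitrate (Mbps) implied by a file size per hour of lecture."""
    return gb_per_hour * 8 * 1000 / 3600


def transfer_seconds(size_gb: float, link_gbps: float) -> float:
    """Theoretical time to move size_gb gigabytes over a link_gbps link."""
    return size_gb * 8 / link_gbps


if __name__ == "__main__":
    for size in (20, 30):
        print(f"{size} GB/hour is about {mezz_mbps(size):.1f} Mbps")
    # 40 Gbps is the Amazon Internet2 figure from the slide.
    print(f"30 GB at 40 Gbps: {transfer_seconds(30, 40):.0f} s")
    # 1 Gbps is a hypothetical classroom uplink, not a figure from the slide.
    print(f"30 GB at 1 Gbps: {transfer_seconds(30, 1):.0f} s")
```

Even at the assumed 1 Gbps, an hour of lecture would move in about four minutes in theory, which is consistent with the slide's point that the wide-area links are not the bottleneck.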

Page 17:

Cluster Node Types
1. Engage
a. Metadata search API
i. Queried by the playback viewer, running in students’ browsers
ii. What courses am I in? For all courses, what presentations can I watch?
b. Playback to producers pre-publish
2. Admin
a. Job coordinator; ingests
3. Workers
4. Utility node
5. Tools nodes: Workflow browser, Analytics

Page 18:

Add Live Streaming

[Diagram. Labels: CA, Camera, Laptop (repeated per classroom); A/V; Magic; Opencast Engage API; Wowza; Students’ Browsers]

Page 19:

Add CDNs

[Diagram. Labels: CA, Camera, Laptop (repeated per classroom); A/V; Magic; Opencast Engage API; Akamai; AWS CloudFront; Students’ Browsers]

Page 20:

Insert S3

[Diagram. Labels: CA, Camera, Laptop (repeated per classroom); A/V; Magic; Opencast Engage API; Akamai; AWS S3; AWS CloudFront; Students’ Browsers]

Page 21:

Add redundancy for live streams

[Diagram. Labels: Camera, Laptop, Primary CA, Secondary CA (repeated per classroom); A/V; Magic; Akamai Primary Data Center; Akamai Secondary Data Center; Akamai Load Balancer]

Page 22:

Architecture - Compute
● Scale to arbitrary* load:
○ How powerful does each worker need to be?
■ Bigger instances process faster, and cost more
○ Automated horizontal scaling:
■ How many workers do you need?
■ When do you need them?
■ Currently time based scaling
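
Time based scaling can be as simple as a lookup from the time of day to a worker count. The sketch below shows the idea; the schedule windows and worker counts are invented for illustration and are not DCE's actual configuration, and `SCHEDULE` and `desired_workers` are hypothetical names.

```python
from datetime import datetime
from typing import List, Tuple

# (start_hour, end_hour, worker_count); end_hour is exclusive.
# All values here are hypothetical.
SCHEDULE: List[Tuple[int, int, int]] = [
    (0, 8, 2),     # overnight: little new content arrives
    (8, 20, 10),   # daytime and evening lectures: peak processing
    (20, 24, 4),   # late evening: drain the remaining queue
]


def desired_workers(now: datetime) -> int:
    """Return the number of workers the schedule calls for at this time."""
    for start, end, count in SCHEDULE:
        if start <= now.hour < end:
            return count
    raise ValueError(f"no schedule entry covers hour {now.hour}")


if __name__ == "__main__":
    print(desired_workers(datetime.now()))
```

A scheduled job could compare this number with the count of running workers and start or stop instances to match. A load based policy would replace the lookup with a measurement such as queue depth.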

Page 23:

The ways we use storage
● “Hot” content - lecture content captured recently and awaiting processing and publication by the production team - Zadara
● “Warm” content -- S3 and S3 Infrequent Access
● Archive -- republishing
● CloudFront:
○ Engage as source of truth
○ Engage to S3; S3 as source of truth
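
The hot and warm distinction can be expressed as a small placement rule. This is a sketch under stated assumptions: the slide does not say what moves content from S3 to S3 Infrequent Access, so the age threshold `IA_AFTER_DAYS` is hypothetical, as are the names `Tier` and `choose_tier`. The archive tier is left out because the slide does not say where archived content lives.

```python
from enum import Enum


class Tier(Enum):
    HOT = "Zadara"
    WARM = "S3"
    WARM_IA = "S3 Infrequent Access"


# Hypothetical cutoff; the slide gives no rule for moving content to IA.
IA_AFTER_DAYS = 30


def choose_tier(published: bool, days_since_published: int = 0) -> Tier:
    """Pick a storage tier for a lecture recording."""
    if not published:
        # Still awaiting processing or publication by the production team.
        return Tier.HOT
    if days_since_published >= IA_AFTER_DAYS:
        return Tier.WARM_IA
    return Tier.WARM


if __name__ == "__main__":
    print(choose_tier(published=False).value)
    print(choose_tier(published=True, days_since_published=90).value)
```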

Page 24:

Student Experience: Player
● Put the student in control:
○ Different layouts, can be changed throughout lecture
○ Different resolutions and bitrates
○ Control display of captions and transcripts
○ Slidedeck download (where applicable)
● Code is from the open source Paella Project
● Skinning is inspired by CS50

Page 25:

DCE OpsWorks for Opencast
● Automated deployment
● Automated config management
● One command and you have a new cluster
● Consistency between dev/stage/prod
● Hard wall between clusters

Page 26:

Zadara is great
● High performance NFS storage in AWS
● It solved our storage problems
● EFS, SoftNAS

Page 27:

AWS is awesome
● It solves your problems:
○ EC2 solved our compute problems
○ S3 met our “warm” and “cold” storage needs
■ S3 Infrequent Access cuts the cost by ⅔
○ RDS solved our DB problems
○ Instances never* go down. The network is always strong*.
● AWS has solutions for most of the things you don’t want to maintain
○ Message queues, sending email, DBs, map-reduce, etc
● It allows you to do things that would normally be crazy hard and expensive:
○ Multi-AZ support; Multi-Region support

Page 28:

AWS is awesome
● It enables you to develop faster

Page 29:

But...
● Cost -- even pennies add up
● AWS is complicated
○ VPC setup
● Some AWS behavior is unexpected:
○ Spin up an instance for 3 minutes of work - pay for an hour
○ Even instances with guaranteed 10Gbps throughput only get 1.6Gbps inbound from the internet.
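
Both surprises are easy to quantify. The sketch below models the hourly rounding described on this slide and compares inbound transfer times at the guaranteed and observed rates. It assumes billing rounds up to whole hours with a one-hour minimum, as the slide describes, and uses decimal units; the 30 GB file size is borrowed from the mezzanine figure earlier in the deck, and the function names are illustrative.

```python
import math


def billed_hours(minutes_used: float) -> int:
    """Hours billed when usage is rounded up to whole hours (minimum one)."""
    return max(1, math.ceil(minutes_used / 60))


def inbound_seconds(size_gb: float, link_gbps: float) -> float:
    """Theoretical time to receive size_gb gigabytes at link_gbps."""
    return size_gb * 8 / link_gbps


if __name__ == "__main__":
    minutes = 3
    hours = billed_hours(minutes)
    print(f"{minutes} min of work is billed as {hours} hour(s); "
          f"{minutes / (hours * 60):.0%} of the paid time is used")
    for rate in (10, 1.6):
        print(f"30 GB at {rate} Gbps: {inbound_seconds(30, rate):.0f} s")
```

Under this model a 3 minute job uses 5% of the hour it pays for, and a 30 GB ingest takes roughly six times longer at 1.6Gbps than at 10Gbps.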

Page 30:

Thoughts on software development
● Customers never know what they want
● If your project is going well, you will have a huge backlog
● People are the hardest part of software development

Page 31:

[email protected]

Software Architect and Director of Software Development