Harvard IT Summit 2016 - Opencast in the Cloud at Harvard DCE - Live and On-Demand from the DCE...
TRANSCRIPT
Extending Harvard to part-time learners with the academic ability, curiosity and drive to succeed at
Harvard.
Opencast in the Cloud at Harvard DCE
Video and Course Content Capture, Processing, Management, and Distribution
Software Architect and Director of Software Development
Our Reliability Requirements
● I am on-call 24/7/365
Our Current Architecture
● Capture Agents (CAs) are the only machines on campus
● Everything else is in AWS
● Live content: CAs → Akamai
● Video On Demand (VOD) content: CAs → ingest to admin → processed on workers → producers polish content → output files pushed to S3 → CloudFront
Goals:
● Unusually high quality recording and playback
○ Sleek, modern player
● Live streaming and Video On Demand (VOD)
● Global audience - consistent experience regardless of timezone
● Exceptionally reliable - capture, transcoding, and distribution
● Fast processing time; volume of content increases most semesters
● Robust - exception cases handled quickly and ideally automatically
● Very responsive support
Goals
● Tools for production staff, teaching staff, technical staff, operations:
○ Archive, republishing
○ Production team interface, including trimming, uploading content from other sources (e.g., Premiere)
○ Workflow Browser, Student Viewing Analytics, Capture Agent Status Board
● Automated deployments:
○ Consistency between Production, Stage and Development clusters
○ Means to run large-scale experiments:
■ Confirm that new optimizations will work as expected in Prod
■ Tune performance - storage, compute, networking configuration
■ Track down bugs that only happen under heavy load
At its root: an extensible workflow engine
● “Workflows” are a series of “Workflow Operations”
● Workflow Operation = Java class:
○ Transcode, automatically detect slide changes, etc.
○ Can do anything Java can do
● Workflow = XML file
○ Series of operations listed in order, with some transitions, e.g., wait until producer sets start and end points
○ Different workflows for different use cases, e.g., live capture, or ingest content from Premiere
● Current production workflows are ~20 steps long
● It’s fairly easy to handle new processing use cases:
○ Write a workflow
○ Write any new Operations as needed
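As a sketch of the shape such an XML workflow definition can take — the operation ids, configuration keys, and encoding profile names below are illustrative, not taken from DCE's actual ~20-step production workflows:

```xml
<definition xmlns="http://workflow.opencastproject.org">
  <id>dce-example-vod</id>
  <title>Example VOD workflow (illustrative)</title>
  <operations>
    <!-- Ingest the captured media onto the admin node -->
    <operation id="ingest" description="Ingest captured files"/>
    <!-- Pause here until a producer sets trim start/end points -->
    <operation id="editor" description="Wait for producer trim points"/>
    <!-- Transcode to the multi-bitrate delivery renditions -->
    <operation id="compose" description="Encode delivery renditions">
      <configurations>
        <configuration key="encoding-profiles">mp4-720p,mp4-540p,mp4-360p,mp4-180p</configuration>
      </configurations>
    </operation>
    <!-- Push the output files to S3 for CloudFront distribution -->
    <operation id="publish" description="Publish outputs to S3"/>
  </operations>
</definition>
```

Each `<operation>` id maps to a Java Workflow Operation class, so a new use case is just a new XML file plus any operations that don't exist yet.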
Architecture - Goal
[Diagram: classrooms, each with a camera and a laptop, feed “A/V miracles / magic,” which delivers content to students]
Architecture - Capture Agents (CAs)
[Diagram: each classroom’s camera and laptop now feed a Capture Agent (CA); the CAs feed the “A/V miracles / magic” that delivers content to students]
The Search for Capture Agents (CAs)
● Most 3rd-party systems require that you buy their hardware and software:
○ Tight integration that you don’t have to worry about
○ Limited by their box’s quality, resolutions, bitrates, number of streams, types of streams, etc.
○ They set the price, and declare the lifetime of your box (e.g., 3 years)
○ Is this good? Bad? Depends on your use cases and funding
● Opencast’s CA API is fully open:
○ Any capture hardware that has an API
○ Any capture software compatible with the CA API
The Search for Capture Agents (CAs)
○ Capture hardware options:
■ Vendor CAs (including their capture software) - Extron, NCast, DataPath, Teltek, etc.
■ Build it yourself, or have an OEM build it for you
■ General A/V boxes - KiPros, Epiphan Pearls, etc.
○ Vendors’ Opencast-specific CAs - open source Capture Agent (CA) software:
■ Galicaster
■ PyCA
■ Harvard DCE’s mhpearl
○ The hard parts:
■ Identifying the use cases you need to solve
■ Finding a CA that works well with your A/V setups in your classrooms, with your networking, etc.
Epiphan Pearl
● Commodity A/V recorder
○ Maxwell Dworkin 119: single Epiphan, used by two different capture systems
● Delightfully reliable
● You can mix and match up to 4 sources into a single channel - side by side, picture in picture, etc.
● Multiple channels can be recorded and streamed at the same time
● (Not used) Has the ability to do live switching between sources if an operator is available to run it during an event
Add Opencast VOD; add multiple bitrate streams
[Diagram: the classroom CAs feed “A/V magic,” then an Opencast cluster (Workflow Engine; Engage API ...) that serves VOD to students’ browsers]
Playback Resolutions and Bit Rates
Video On Demand (VOD):
● 720p @ 5 Mbps
● 540p @ 2 Mbps
● 360p @ 400 Kbps
● 180p @ 150 Kbps
Live Streaming:
Doublewide - Presenter and Presentation in a single stream
● 1080p @ 5 Mbps
● 810p @ 2 Mbps
● 540p @ 400 Kbps
● 270p @ 150 Kbps
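A ladder like this lends itself to standard adaptive selection: the player picks the highest rendition whose bitrate fits the measured bandwidth. A minimal sketch in Python — the ladder values come from the slide, but the 0.8 safety factor and the function itself are illustrative, not DCE's actual player logic:

```python
# Hypothetical rendition selection over the VOD bitrate ladder.
# Ladder values are from the slide; the safety factor is an assumption.

VOD_LADDER = [  # (name, bitrate in Kbps)
    ("720p", 5000),
    ("540p", 2000),
    ("360p", 400),
    ("180p", 150),
]

def pick_rendition(bandwidth_kbps, ladder=VOD_LADDER, safety=0.8):
    """Pick the highest-bitrate rendition that fits within a safety
    margin of the measured bandwidth; fall back to the lowest."""
    usable = bandwidth_kbps * safety
    fitting = [r for r in ladder if r[1] <= usable]
    if fitting:
        return max(fitting, key=lambda r: r[1])
    return min(ladder, key=lambda r: r[1])
```

With a measured ~3 Mbps this selects 540p; on a very slow link it falls back to the 150 Kbps 180p rendition.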
Isn’t that a lot of data to push?
● Mezz files are 20-30 GB per hour of lecture
● Use existing infrastructure:
○ Harvard’s Internet2 connection is >= 100 Gbps
○ Amazon’s Internet2 connection is >= 40 Gbps
● Slowest hop:
○ From classrooms to Harvard’s outgoing switches
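Back-of-the-envelope arithmetic for the numbers above — this ignores protocol overhead, and the 1 Gbps classroom uplink is an assumed figure for the "slowest hop," not a number stated in the talk:

```python
def transfer_minutes(size_gb, link_gbps):
    """Idealized transfer time: size in gigabytes over a link in
    gigabits per second, ignoring protocol overhead and contention."""
    return size_gb * 8 / link_gbps / 60

# A 30 GB mezzanine file over an assumed 1 Gbps classroom uplink:
classroom = transfer_minutes(30, 1)     # 4.0 minutes
# The same file over the >= 100 Gbps Internet2 connection:
internet2 = transfer_minutes(30, 100)   # ~0.04 minutes (~2.4 seconds)
```

So even at classroom speeds, an hour of lecture uploads in a few minutes; the Internet2 links are nowhere near being the bottleneck.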
Cluster Node Types
1. Engage
a. Metadata search API
i. Queried by the playback viewer, running in students’ browsers: what courses am I in? For all courses, what presentations can I watch?
ii. Playback to producers pre-publish
2. Admin
a. Job coordinator; ingests
3. Workers
4. Utility node
5. Tools nodes: Workflow Browser, Analytics
Add Live Streaming
[Diagram: live streaming added - the CAs feed both the Opencast Engage API and a Wowza streaming server, which serve students’ browsers]
Add CDNs
[Diagram: CDNs added - Akamai and AWS CloudFront now sit between the Opencast Engage API and students’ browsers]
Insert S3
[Diagram: S3 inserted - AWS S3 now sits behind AWS CloudFront, alongside Akamai, between the Opencast Engage API and students’ browsers]
Add redundancy for live streams
[Diagram: live-stream redundancy added - each classroom now has a primary and a secondary CA, feeding Akamai primary and secondary data centers behind an Akamai load balancer]
Architecture - Compute
● Scale to arbitrary* load:
○ How powerful does each worker need to be?
■ Bigger instances process faster, and cost more
○ Automated horizontal scaling:
■ How many workers do you need?
■ When do you need them?
■ Currently time-based scaling
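Time-based scaling can be sketched as a simple schedule mapping clock time to a desired worker count. The hours and counts below are hypothetical, since the talk does not give DCE's actual schedule:

```python
def desired_workers(hour_utc, weekday):
    """Illustrative time-based scaling schedule (hypothetical numbers).

    weekday: 0 = Monday ... 6 = Sunday.
    Scale the worker pool up during weekday capture/processing hours
    (roughly 9am-7pm US/Eastern, i.e. 13:00-23:00 UTC) and keep a
    small floor overnight and on weekends.
    """
    if weekday < 5 and 13 <= hour_utc < 23:
        return 10   # peak: lectures being captured and processed
    return 2        # floor: off-hours jobs and safety margin
```

In practice a schedule like this would drive an autoscaling group's desired capacity, e.g., via scheduled scaling actions.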
The ways we use storage
● “Hot” content - lecture content captured recently and awaiting processing and publication by the production team - Zadara
● “Warm” content - S3 and S3 Infrequent Access
● Archive - republishing
● CloudFront:
○ Engage as source of truth
○ Engage to S3; S3 as source of truth
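The warm/archive tiering above maps naturally onto an S3 lifecycle configuration. A sketch of the shape such a rule can take — the prefix, day thresholds, and rule name are hypothetical, not DCE's actual policy:

```python
# Hypothetical S3 lifecycle configuration, in the shape accepted by
# the S3 PutBucketLifecycleConfiguration API: published "warm" objects
# move to Infrequent Access after 30 days, then to archival storage
# after a year.
lifecycle_config = {
    "Rules": [
        {
            "ID": "warm-then-archive",
            "Status": "Enabled",
            "Filter": {"Prefix": "published/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 365, "StorageClass": "GLACIER"},
            ],
        }
    ]
}
```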
Student Experience: Player
● Put the student in control:
○ Different layouts, can be changed throughout lecture
○ Different resolutions and bitrates
○ Control display of captions and transcripts
○ Slide deck download (where applicable)
● Code is from the open source Paella Project
● Skinning is inspired by CS50
DCE OpsWorks for Opencast
● Automated deployment
● Automated config management
● One command and you have a new cluster
● Consistency between dev/stage/prod
● Hard wall between clusters
Zadara is great
● High-performance NFS storage in AWS
● It solved our storage problems
● EFS, SoftNAS
AWS is awesome
● It solves your problems:
○ EC2 solved our compute problems
○ S3 met our “warm” and “cold” storage needs
■ S3 Infrequent Access cuts the cost by ⅔
○ RDS solved our DB problems
○ Instances never* go down. The network is always strong*.
● AWS has solutions for most of the things you don’t want to maintain:
○ Message queues, sending email, DBs, map-reduce, etc.
● It allows you to do things that would normally be crazy hard and expensive:
○ Multi-AZ support; Multi-Region support
AWS is awesome
● It enables you to develop faster
But...
● Cost - even pennies add up
● AWS is complicated:
○ VPC setup
● Some AWS behavior is unexpected:
○ Spin up an instance for 3 minutes of work - pay for an hour
○ Even on instances with guaranteed 10 Gbps throughput, we only get 1.6 Gbps inbound from the internet
Thoughts on software development
● Customers never know what they want
● If your project is going well, you will have a huge backlog
● People are the hardest part of software development