
Page 1: Paul Wang SOED 2016

Video-based Big Data Analytics in Cyberlearning

Shuangbao (Paul) Wang, Ph.D. Professor, Director

Center for Security Studies

Page 2: Paul Wang SOED 2016

1913

1920 × 1080 × 50 × 60 ≈ 2M × 3000 ≈ 6G
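The arithmetic on this slide can be checked with a short sketch. The slide gives no units, so the reading used here is an assumption: 1920 × 1080 is the frame size in pixels, 50 is a frame rate in frames per second, and 60 is a duration in seconds, which makes the product the number of pixels in one minute of video.

```python
# Assumption (not stated on the slide): 1080p frames, 50 frames/s, 60 s.
WIDTH, HEIGHT = 1920, 1080
FPS, SECONDS = 50, 60

pixels_per_frame = WIDTH * HEIGHT          # about 2M pixels per frame
frames = FPS * SECONDS                     # 3000 frames in one minute
total_pixels = pixels_per_frame * frames   # about 6G pixels in one minute

print(f"{pixels_per_frame:,} pixels/frame x {frames:,} frames = {total_pixels:,} pixels")
```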

2016

Page 3: Paul Wang SOED 2016

Videos in Cyberlearning – big data

• Video use is growing rapidly in education (and elsewhere)
• MOOCs (e.g., edX, Coursera, Udacity) rely on videos
• Huge repositories (e.g., RBDIL, NSDL) contain extraordinary amounts of valuable video data
• Videos are big data, unstructured
– Hardly analyzed by current data analytics tools
• Cyberlearning requires more interactions

Page 4: Paul Wang SOED 2016

RBDIL – Rutgers University

Page 5: Paul Wang SOED 2016

NSDL – National Science Digital Library

Page 8: Paul Wang SOED 2016

Using Videos Effectively in Learning

• Interactive? (DoD)
• Assessments? (adaptive)
• Easy for instructors to use? (course development)
• Accessibility? (Mac, mobile)
• Track students’ growth? (over multiple years)
• Long videos vs. short ones? (crop)
• Recording methodology? (noises and echo)

Page 9: Paul Wang SOED 2016

inVideo - A Novel Big Data Analytics Tool for Video Data Analytics

• Analyzing Video by Keywords
• Content Based Image Retrieval (CBIR)
• Pattern Recognition (PR)
• Multiple Languages

Page 10: Paul Wang SOED 2016

inVideo: Analyzing Video by Keywords

• Audio is stripped and used to generate a transcript
• Transcript is indexed back to the original media
• Video is now searchable/mineable by keyword

The result shows that 7 video clips from three videos were retrieved for the keyword “online”.
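The three steps above can be sketched as a small inverted index over a time-stamped transcript. This is a minimal illustration, not inVideo's implementation: the `Segment` type, the sample transcript text, and the video names are all hypothetical, and a speech recognizer is assumed to have already produced the segments.

```python
from collections import defaultdict
from dataclasses import dataclass
import re

@dataclass(frozen=True)
class Segment:
    """One transcript segment tied back to its position in a video (hypothetical type)."""
    video: str
    start: float  # seconds
    end: float    # seconds
    text: str

def build_index(segments):
    """Map each lowercase word to the segments whose transcript contains it."""
    index = defaultdict(list)
    for seg in segments:
        for word in set(re.findall(r"\w+", seg.text.lower())):
            index[word].append(seg)
    return index

def search(index, keyword):
    """Return (video, start, end) for every segment containing the keyword."""
    return [(s.video, s.start, s.end) for s in index.get(keyword.lower(), [])]

# Made-up transcript segments standing in for recognizer output.
segments = [
    Segment("lecture1", 0.0, 12.5, "Welcome to the online course"),
    Segment("lecture1", 12.5, 30.0, "Today we cover port scanning"),
    Segment("lecture2", 5.0, 18.0, "Online shopping needs encryption"),
]
index = build_index(segments)
print(search(index, "online"))
```

Each hit carries the start and end time of its segment, which is what lets a keyword match be returned as a playable clip rather than as a whole video.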

Page 11: Paul Wang SOED 2016

inVideo: Content Based Image Retrieval (CBIR)

• Provide a picture reference
• Search video content (frames) that contains the reference picture
• Return the video clips

The result shows a match at 0.05 sec. in the video named “student”.
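A toy version of this frame search is sketched below. The slides do not say which image features inVideo compares, so the intensity-histogram comparison here is only an assumed stand-in for a real CBIR feature, and the tiny 2 × 2 "frames" are made-up data.

```python
def histogram(image, bins=4, max_value=256):
    """Normalized intensity histogram of a grayscale image given as a list of rows."""
    counts = [0] * bins
    pixels = [p for row in image for p in row]
    for p in pixels:
        counts[p * bins // max_value] += 1
    return [c / len(pixels) for c in counts]

def similarity(h1, h2):
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def best_match(frames, reference, fps):
    """Return (timestamp_seconds, score) of the frame most similar to the reference."""
    ref_hist = histogram(reference)
    scores = [similarity(histogram(f), ref_hist) for f in frames]
    best = max(range(len(frames)), key=scores.__getitem__)
    return best / fps, scores[best]

# Made-up frames: one dark, one mixed, one bright.
dark = [[10, 20], [30, 40]]
mixed = [[10, 200], [30, 220]]
bright = [[200, 210], [220, 230]]
frames = [dark, mixed, bright]

timestamp, score = best_match(frames, reference=[[205, 215], [225, 235]], fps=20)
print(timestamp, score)
```

The frame index is converted to a timestamp by dividing by the frame rate, which is how a matching frame can be reported as a position in the video.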

Page 12: Paul Wang SOED 2016

inVideo: Pattern Recognition (PR)

• Provide a keyword reference
• Search video content (frames) that contains the object described by the “keyword”

The results show that three videos were retrieved that contain objects that look like the keyword “credit card”.

Page 13: Paul Wang SOED 2016

inVideo: Analyzing Different Languages

• Input keywords in other languages
• Search transcript for keywords in that language
• Retrieve video clips that match

The results show that two video clips in one video contain the keyword “学生” (the word “student” in Chinese).
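One practical detail of searching Chinese transcripts is that words are not separated by spaces, so splitting on whitespace does not work. The sketch below uses plain substring matching instead; the segment times and sentences are made-up sample data, and the function name is hypothetical.

```python
def find_clips(segments, keyword):
    """Return (start, end) of the segments whose text contains the keyword.

    Substring matching is used instead of word splitting because Chinese
    text is not delimited by spaces.
    """
    needle = keyword.casefold()
    return [(start, end) for start, end, text in segments if needle in text.casefold()]

# Made-up transcript segments: (start seconds, end seconds, text).
segments = [
    (0.0, 8.0, "欢迎来到网络课程"),
    (8.0, 15.0, "每个学生都需要一个账号"),
    (15.0, 24.0, "老师会回答学生的问题"),
]
print(find_clips(segments, "学生"))
```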

Page 15: Paul Wang SOED 2016

Video: Linear to Interactive

Before: A 46-minute video

After: 2-3 minute video clips with assessments in between and at the end

• Clip #0: Introduction
• Clip #1: Password cracking
• Clip #2: Port scan
• Clip #3: Encryption
• Clip #4: Forensics
• Clip #5: Cyber weapon
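The linear-to-interactive conversion can be pictured as building a playlist that alternates clips and assessments. The sketch below cuts at a fixed length purely for illustration; that is an assumption, since the clips on the slide follow topics rather than a fixed duration. The function names are hypothetical.

```python
def split_video(duration_s, clip_s):
    """Cut [0, duration_s) into consecutive clips of at most clip_s seconds."""
    return [(start, min(start + clip_s, duration_s))
            for start in range(0, duration_s, clip_s)]

def interleave(clips):
    """Place an assessment after every clip; the last one closes the lesson."""
    playlist = []
    for i, (start, end) in enumerate(clips):
        playlist.append(("clip", i, start, end))
        playlist.append(("assessment", i))
    return playlist

clips = split_video(duration_s=46 * 60, clip_s=150)  # 2.5-minute clips (assumed length)
playlist = interleave(clips)
print(len(clips), "clips; last clip:", clips[-1])
```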

Page 16: Paul Wang SOED 2016

Define “Learning Objects” (Instructors)

• Learning objects composed of short video clips

• Assessment of learning outcomes of studying video content

• Teachers: select a video segment and assign Q&As

Drag the stage bar and click the “From” button; continue dragging and then click the “To” button. Add a question and answers.
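A learning object as described on this slide pairs a video segment (the “From” and “To” positions) with a question and its answers. One possible data structure is sketched below; the class, its field names, the file name, and the sample question are all hypothetical and are not taken from inVideo.

```python
from dataclasses import dataclass, field

@dataclass
class LearningObject:
    """A video segment plus the question an instructor attaches to it (hypothetical)."""
    video: str
    start_s: float          # position when "From" was clicked
    end_s: float            # position when "To" was clicked
    question: str
    answers: list = field(default_factory=list)
    correct_index: int = 0

    def __post_init__(self):
        # Reject segments that are empty or run backwards.
        if not 0 <= self.start_s < self.end_s:
            raise ValueError("segment must satisfy 0 <= start < end")
        # The correct answer must be one of the listed answers.
        if not 0 <= self.correct_index < len(self.answers):
            raise ValueError("correct_index must point at one of the answers")

lo = LearningObject(
    video="cybersecurity_intro.mp4",   # made-up file name
    start_s=120.0,
    end_s=270.0,
    question="Which kind of tool finds open services on a host?",
    answers=["Port scanner", "Password cracker", "Disk imager"],
    correct_index=0,
)
print(lo.end_s - lo.start_s)
```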

Page 17: Paul Wang SOED 2016

Learning and Assessments

• View the whole video, and take a quiz

• Review the video clip corresponding to the question

Click the “Review” button to review the video clip; click the speaker icon to have the question read aloud; click “Confirm” to check your answer.
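The student-side flow can be sketched as two small functions: one that checks an answer (“Confirm”) and one that looks up which part of the video to replay (“Review”). The quiz items, field names, and responses below are made up for illustration.

```python
def check_answer(item, chosen_index):
    """Return True if the chosen answer is the correct one ("Confirm")."""
    return chosen_index == item["correct_index"]

def review_range(item):
    """Return the (start, end) seconds of the clip tied to this question ("Review")."""
    return item["start_s"], item["end_s"]

# Made-up quiz items, each linked to the clip that covers the answer.
quiz = [
    {"question": "What does a port scan reveal?",
     "answers": ["Open ports", "Passwords"],
     "correct_index": 0, "start_s": 300.0, "end_s": 450.0},
    {"question": "What protects data in transit?",
     "answers": ["Forensics", "Encryption"],
     "correct_index": 1, "start_s": 450.0, "end_s": 600.0},
]
responses = [0, 0]  # the student's chosen answer indices

score = sum(check_answer(q, r) for q, r in zip(quiz, responses))
to_review = [review_range(q) for q, r in zip(quiz, responses) if not check_answer(q, r)]
print(score, to_review)
```

Because every question stores its own clip boundaries, a wrong answer can send the student straight back to the relevant 2-3 minutes instead of the whole video.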

Page 18: Paul Wang SOED 2016

Case Study: Cybersecurity Program

Student Engagement for the 24 Classrooms

inVideo: Turn videos into interactive learning content

Page 19: Paul Wang SOED 2016

Low Accuracy

• Video 1 (Individual): 45 video clips
• Video 2 (Small Class): 29 video clips
• Video 3 (Full Classroom): 29 video clips

Page 20: Paul Wang SOED 2016

Accuracy of transcripts of 9 video clips from three original videos
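The slides do not define how transcript accuracy was measured. One common choice, assumed here, is word accuracy: one minus the word error rate, where errors are counted with a word-level edit distance against a reference transcript. The two sample sentences are made up.

```python
def word_edit_distance(ref, hyp):
    """Levenshtein distance between two word lists (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference word
                            curr[j - 1] + 1,      # insert a hypothesis word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]

def word_accuracy(reference, hypothesis):
    """Return 1 - word error rate, clamped at 0 (an assumed accuracy measure)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    return max(0.0, 1.0 - word_edit_distance(ref, hyp) / len(ref))

reference = "students can shop online with a credit card"    # made-up ground truth
hypothesis = "students can shop on line with credit card"    # made-up recognizer output
print(f"{word_accuracy(reference, hypothesis):.3f}")
```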

Page 21: Paul Wang SOED 2016

SDLC

Accuracy comparison:
• “Hit rate” before: standard parameters
• “Hit rate” after: revised parameters

No improvement!

Page 22: Paul Wang SOED 2016

Correlations?

Low accuracy vs. recording methods

• Low accuracy
– 10% or less
– Individual22 (45+31+29 video clips)
• Medium accuracy
– 40 to 60%
– EdX_EDM, EdX_ajax (20 video clips)
• High accuracy
– 90% or higher
– Phone.p2, online_shopping (30 video clips)

Page 23: Paul Wang SOED 2016

Online Shopping A=90%

Individual 22 A=10%

rfeb07 A=10%

phone.p2 A=90%

Page 24: Paul Wang SOED 2016

Phone.p2 A=90%

r002 A=10%

edX.ajax A=60%

edX.EDM A=50%

Page 25: Paul Wang SOED 2016

Voice-over re-Recording

• Re-recorded the voice-over for the videos

• Merged the new audio track with the original videos

• Analyzed the signal while recording

• Accuracy improved significantly!

Page 26: Paul Wang SOED 2016

Correlations

• Low accuracy is expressed in high quefrency
– A measurement of ambient noises
– Echo
• Recording methods
– One microphone (per person)
– Used condenser microphone instead of dynamic one
• Recording setting could affect the audio quality (for digital processing)
– Experiments
– Guide to digital recording
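Quefrency is the horizontal axis of the cepstrum, the inverse Fourier transform of the log magnitude spectrum, and an echo shows up there as a peak at the echo delay. The sketch below is the textbook real cepstrum applied to synthetic data, not the analysis used in the study: white noise stands in for a clean recording, and the delay, gain, and function names are assumptions.

```python
import cmath
import math
import random

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def real_cepstrum(signal):
    """Inverse FFT of the log magnitude spectrum of a real signal."""
    n = len(signal)
    log_mag = [math.log(abs(v) + 1e-12) for v in fft(signal)]
    # log_mag is real and symmetric, so its inverse FFT equals fft(log_mag) / n.
    return [v.real / n for v in fft(log_mag)]

def echo_delay(signal, min_lag=1):
    """Quefrency (in samples) of the largest cepstral peak at or above min_lag."""
    c = real_cepstrum(signal)
    return max(range(min_lag, len(signal) // 2), key=lambda q: c[q])

rng = random.Random(0)
N, DELAY, GAIN = 2048, 100, 0.5          # assumed echo: 100 samples late, half amplitude
dry = [rng.gauss(0.0, 1.0) for _ in range(N)]   # stand-in for a clean recording
wet = [dry[i] + (GAIN * dry[i - DELAY] if i >= DELAY else 0.0) for i in range(N)]

print("estimated echo delay (samples):", echo_delay(wet, min_lag=20))
```

With real speech the lowest quefrencies are dominated by the vocal tract, which is why the search starts at `min_lag` rather than at zero.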

Page 29: Paul Wang SOED 2016

Transcript Time-stamping System (TTS)

• Adding timestamps for already transcribed videos

• Fuzzy Search
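One way fuzzy search can support time-stamping is to slide each sentence of an existing, clean transcript over noisy recognizer output that carries word times, and keep the best-matching window. The sketch below uses `difflib.SequenceMatcher` from the Python standard library; the function name and the sample recognizer output are hypothetical, and this is not a description of how TTS itself works.

```python
from difflib import SequenceMatcher

def timestamp_sentence(sentence, asr_words):
    """Find the recognizer word window that best matches the sentence.

    asr_words is a list of (word, start_seconds) pairs from a noisy recognizer.
    Returns (start_seconds, similarity) of the best window, or (None, 0.0)
    when there is no recognizer output.
    """
    if not asr_words:
        return None, 0.0
    target = " ".join(sentence.lower().split())
    width = len(target.split())
    best_start, best_score = None, -1.0
    for i in range(max(1, len(asr_words) - width + 1)):
        window = " ".join(w.lower() for w, _ in asr_words[i:i + width])
        score = SequenceMatcher(None, target, window).ratio()
        if score > best_score:
            best_start, best_score = asr_words[i][1], score
    return best_start, best_score

# Made-up recognizer output containing two misrecognized words.
asr_words = [("welcome", 0.0), ("to", 0.6), ("the", 0.8), ("coarse", 1.0),
             ("today", 2.1), ("we", 2.5), ("cover", 2.7), ("pot", 3.1), ("scanning", 3.4)]

start, score = timestamp_sentence("Today we cover port scanning", asr_words)
print(start, round(score, 2))
```

Approximate matching matters here because the recognizer's words rarely agree exactly with the human transcript, so an exact string search would miss most sentences.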

Page 30: Paul Wang SOED 2016

Further Discussions

• Web API
• Search progression (over the years)
• Voice cancellation/reduction
• Automatic Time-stamping
• Curriculum Development
• Build community - collaborations

Page 31: Paul Wang SOED 2016

Publications

Page 32: Paul Wang SOED 2016

Shuangbao (Paul) Wang, Ph.D.

William Kelly, Metonymy Corporation

Xiaolong Cheng, Doctoral Candidate, George Washington University