Overview
• Concept
• Hardware, Devices
• Sound, Audio
• Virtual Director
• Whiteboard
– Cells, classification and background
– Image filters
– Key Framing
• Conclusions
Specification
• Simple, cheap hardware
• Maximal comfort for the participants
• No special pens, normal WB
Hardware
Ring camera:
• Array of 5 cheap cameras (~$50)
• Total of 3000x480 pixels
• 360° view
• 8 microphones
• 1394 bus to server
Hardware
Whiteboard camera:
• Still, consumer-level 4MP camera: Canon G2
• One shot every 5 seconds
• MJPEG format via USB to server
Hardware
Meeting Room Server:
• Intel dual P4 2.2 GHz
Archived Meeting Server:
• Intel dual P4 2.2 GHz
Hardware
Kiosk:
• Simple switchboard to set up, start and stop the DM system
• Keycard reader for participants
Sound, Audio
• SSL: sound source localization. Goal: which participant is speaking?
• Noise filtering:
– Background noise (fans, server, etc.)
– Reverberations
• Beam forming:
– The microphone array is virtually aimed at the speaker
– Helps dereverberate the audio
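A minimal sketch of how localization and beam forming can work for one pair of microphones, assuming a far-field source and integer-sample delays. The function names, the cross-correlation search and the delay-and-sum combination are illustrative assumptions, not the DM system's actual algorithms.

```python
import math

def estimate_delay(ref, sig, max_lag):
    """Return the lag (in samples) at which `sig` best matches `ref`.

    A positive result means `sig` receives the sound later than `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation of the two channels at this lag.
        score = sum(
            r * sig[n + lag]
            for n, r in enumerate(ref)
            if 0 <= n + lag < len(sig)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def delay_to_angle(lag, sample_rate, mic_distance, speed_of_sound=343.0):
    """Far-field bearing (radians) for one microphone pair; 0 is broadside."""
    x = speed_of_sound * lag / (sample_rate * mic_distance)
    return math.asin(max(-1.0, min(1.0, x)))

def delay_and_sum(channels, lags):
    """Shift each channel back by its lag and average the aligned signals."""
    length = len(channels[0])
    out = [0.0] * length
    for channel, lag in zip(channels, lags):
        for n in range(length):
            if 0 <= n + lag < length:
                out[n] += channel[n + lag]
    return [v / len(channels) for v in out]
```

Aligning the channels before averaging reinforces sound from the estimated direction, while noise and reflections arriving from other directions add up less coherently.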
Virtual Director
• Closes up on the speaker(s) in the "speaker window"
• Zooms the 360° view
• Uses SSL and a visual multi-person tracker as its decision basis
• Has to make "good decisions" on what to show (show the speaker instantly, show multiple speakers, not switch too often, etc.)
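The tension between showing the speaker instantly and not switching too often can be illustrated with a minimum shot duration. This is a sketch only: the function `direct`, the one-label-per-second input and the `min_shot_s` parameter are assumptions, not the rules the DM virtual director actually uses.

```python
def direct(speaker_per_second, min_shot_s=4):
    """Return the camera target for each second of the meeting.

    speaker_per_second: one speaker id per second, or None for silence.
    min_shot_s: hypothetical minimum time a shot is held before a cut.
    """
    shots = []
    current, held = None, 0
    for speaker in speaker_per_second:
        if current is None and speaker is not None:
            # Nobody is shown yet: cut to the first speaker immediately.
            current, held = speaker, 0
        elif speaker is not None and speaker != current and held >= min_shot_s:
            # A different person speaks and the shot has been held long enough.
            current, held = speaker, 0
        shots.append(current)
        held += 1
    return shots
```

With `min_shot_s=3`, a new speaker who starts two seconds into a shot is shown one second later, when the current shot has been held long enough.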
Whiteboard
Decision:
• Live camera with low resolution catches movements but misses content
• Still camera with high resolution catches WB content but misses movements (chosen)
Whiteboard
Requirements:
• No special drawing and erasing tools
• No keyframe marking button next to WB
• Fixed camera
• Cheap, fully remote-controllable camera: Canon G2 (4 megapixels, with SDK)
Whiteboard
Arising problems:
• Obscuring foreground objects
• Optical distortion of WB
• Imperfect white of the WB
• Recognizing strokes
Whiteboard
Image sequence analysis:
1. Rectify
2. Extract WB background color
3. Cluster cell images
4. Classify as {stroke, foreground object or WB}
5. Filter cell images
6. Extract key frame images
7. Color-balance key frame images
Whiteboard: 1) Rectifying
• The corners of the WB are calibrated once, by hand
• Everything outside the WB is cropped away
• The WB is warped bilinearly into a rectangle, using bicubic interpolation for resampling
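A sketch of the bilinear warp, assuming the four calibrated corners are given in the order top-left, top-right, bottom-right, bottom-left. To stay short it samples the nearest source pixel, where the slides use bicubic interpolation. The function name and image layout (a list of rows) are assumptions.

```python
def rectify(image, corners, out_w, out_h):
    """Warp the quadrilateral `corners` of `image` to an out_w x out_h rectangle.

    image: list of rows of pixel values.
    corners: (x, y) source positions of the top-left, top-right,
             bottom-right and bottom-left whiteboard corners.
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = corners
    out = []
    for j in range(out_h):
        v = j / (out_h - 1) if out_h > 1 else 0.0
        row = []
        for i in range(out_w):
            u = i / (out_w - 1) if out_w > 1 else 0.0
            # Bilinear blend of the four corner positions gives the source point.
            sx = (1 - u) * (1 - v) * x0 + u * (1 - v) * x1 + u * v * x2 + (1 - u) * v * x3
            sy = (1 - u) * (1 - v) * y0 + u * (1 - v) * y1 + u * v * y2 + (1 - u) * v * y3
            # Nearest-neighbour sampling stands in for bicubic interpolation here.
            row.append(image[round(sy)][round(sx)])
        out.append(row)
    return out
```

Because only pixels inside the quadrilateral are sampled, the cropping of everything outside the WB happens as a side effect of the warp.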
Whiteboard: 2) Extracting BG color
• For every image, find the background color of every cell
• Parts may be obscured (holes)
• Must be accurate for final white-balancing
Whiteboard: 2) Extracting BG color
Strategy 1:
• Assumption: WB cells are the brightest
• Holes are filled from the nearest neighbours
• May fail, e.g. with paper in the foreground
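A sketch of this first strategy on a grid of per-cell brightness values. The brightness threshold that marks a cell as a hole and both function names are assumptions made for illustration.

```python
def brightest_over_time(frames):
    """Per cell, keep the brightest value seen in any frame."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[max(f[r][c] for f in frames) for c in range(cols)] for r in range(rows)]

def fill_holes(bg, is_hole):
    """Replace every hole cell with the value of the nearest non-hole cell."""
    rows, cols = len(bg), len(bg[0])
    valid = [(r, c) for r in range(rows) for c in range(cols) if not is_hole[r][c]]
    out = [row[:] for row in bg]
    for r in range(rows):
        for c in range(cols):
            if is_hole[r][c] and valid:
                # Nearest valid cell by squared Euclidean distance on the grid.
                nr, nc = min(valid, key=lambda p: (p[0] - r) ** 2 + (p[1] - c) ** 2)
                out[r][c] = bg[nr][nc]
    return out
```

A bright sheet of paper held in front of the board would pass the brightness test and be taken for whiteboard, which is the failure case noted above.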
Whiteboard: 2) Extracting BG color
Strategy 2:
• Histogram of each cell (over time)
• Peaks are very likely WB BG
• Detect "outliers" with least median of squares
• Fill outliers from their neighbours, as in strategy 1
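A sketch of the histogram idea for one cell, followed by a simple outlier check across cells. The outlier test below uses the median and the median absolute deviation as a stand-in; it is not the least-median-of-squares fit named on the slide. The bin width and the factor `k` are assumptions.

```python
from collections import Counter
from statistics import median

def histogram_peak(values, bin_width=8):
    """Return the centre of the fullest histogram bin of one cell's brightness over time."""
    counts = Counter(v // bin_width for v in values)
    # On a tie, prefer the brighter bin.
    best_bin = max(counts, key=lambda b: (counts[b], b))
    return best_bin * bin_width + bin_width // 2

def flag_outliers(peaks, k=3.0):
    """Flag cells whose peak deviates strongly from the median of all cells."""
    med = median(peaks)
    mad = median(abs(p - med) for p in peaks) or 1.0
    return [abs(p - med) > k * mad for p in peaks]
```

A cell that is covered for most of the meeting produces a peak at the covering object's color, so it shows up as an outlier among its neighbours and can be refilled from them.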
Whiteboard: 4) Classifying
3 classes:
• Whiteboard (background): greyish, RGB values roughly equal
• Strokes: mostly grey with a slight color tint
• Foreground objects (obscuring the WB): anything else
Whiteboard: 4) Classifying
• The cell contents are compared to the previously computed background color, using:
– Whiteboard color
– Whiteboard standard deviation
– Current cell's mean color
– Current cell's standard deviation
[Figure: cells classified as whiteboard, stroke, foreground]
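One way such a comparison could look, using the four quantities listed above. The decision rules and all three thresholds (`t_wb`, `t_std`, `t_color`) are hypothetical choices for illustration; the slides do not state the actual test.

```python
import math

def classify_cell(mean, std, wb_mean, wb_std, t_wb=2.0, t_std=2.0, t_color=30.0):
    """Classify a cell as 'whiteboard', 'stroke' or 'foreground'.

    mean, wb_mean: (r, g, b) mean colors of the cell and of the whiteboard.
    std, wb_std: scalar standard deviations of the cell and of the whiteboard.
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(mean, wb_mean)))
    # Whiteboard: color close to the background and no extra variation.
    if dist / max(std + wb_std, 1e-6) < t_wb and std / max(wb_std, 1e-6) < t_std:
        return "whiteboard"
    # Stroke (assumed rule): thin lines raise the variation but barely shift the mean.
    if dist < t_color:
        return "stroke"
    return "foreground"
```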
Whiteboard: 5) Filtering
1. Reclassify isolated foreground cells as strokes
2. Reclassify stroke cells next to foreground cells as foreground cells
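The two filter rules can be sketched on a grid of labels. The labels "wb", "stroke" and "fg", the 8-cell neighbourhood and the single pass per rule are assumptions.

```python
def filter_cells(grid):
    """Apply the two reclassification rules to a grid of cell labels."""
    rows, cols = len(grid), len(grid[0])

    def has_fg_neighbour(g, r, c):
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                    if g[rr][cc] == "fg":
                        return True
        return False

    # Rule 1: an isolated foreground cell is more likely a stroke.
    step1 = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == "fg" and not has_fg_neighbour(grid, r, c):
                step1[r][c] = "stroke"

    # Rule 2: a stroke cell touching foreground is treated as foreground.
    step2 = [row[:] for row in step1]
    for r in range(rows):
        for c in range(cols):
            if step1[r][c] == "stroke" and has_fg_neighbour(step1, r, c):
                step2[r][c] = "fg"
    return step2
```

Rule 2 errs on the side of caution: a cell at the edge of a person may be partly covered, so it is not trusted as board content.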
Whiteboard: 6) Extracting key frames
• Key frames should contain the "most important" WB content
• The best moment to take a key frame is right before a major erasure
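A sketch of finding those moments from the number of stroke cells per frame: a key frame is taken at the last peak before the count drops sharply. The `min_drop` fraction, the extra key frame for the final board state and the function name are assumptions.

```python
def key_frame_indices(stroke_counts, min_drop=0.2):
    """Return frame indices just before each major erasure.

    stroke_counts: number of stroke cells in each frame, in time order.
    min_drop: hypothetical fraction of the peak that counts as a major erasure.
    """
    keys = []
    peak_i = 0
    erasing = False
    for i in range(1, len(stroke_counts)):
        if erasing:
            # Wait until writing resumes before tracking a new peak.
            if stroke_counts[i] > stroke_counts[i - 1]:
                erasing = False
                peak_i = i
            continue
        if stroke_counts[i] >= stroke_counts[peak_i]:
            peak_i = i
        elif stroke_counts[peak_i] - stroke_counts[i] >= min_drop * stroke_counts[peak_i]:
            keys.append(peak_i)
            erasing = True
    # Assumption: the board content at the end of the meeting is also kept.
    if stroke_counts and not erasing and stroke_counts[peak_i] > 0:
        keys.append(peak_i)
    return keys
```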
Whiteboard: 6) Extracting key frames
Image reconstruction:
1. If the cell image is WB or stroke, use it
2. If a foreground object neighbours or obscures the cell, search the cluster for the most recent valid cell image
3. If no cell image in the cluster is valid, replace it with the WB color
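The three rules amount to a backwards search through one cell's history. In this sketch the history is a list of (label, image) pairs, oldest first, ending at the key frame time. The labels are assumed to already mark cells next to a foreground object as "fg".

```python
def reconstruct_cell(history, wb_color):
    """Pick the image to show for one cell in a key frame.

    history: list of (label, image) pairs for this cell, oldest first.
    wb_color: fallback whiteboard color for this cell.
    """
    # Rules 1 and 2: walk back from the key frame time to the latest valid image.
    for label, image in reversed(history):
        if label in ("wb", "stroke"):
            return image
    # Rule 3: the cell was never visible, so paint it in the board color.
    return wb_color
```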
Whiteboard:
• Every stroke cell receives a time stamp of when it was drawn
• In the browser, every stroke cell not yet drawn is shown as a "ghost"
• Clicking on any stroke cell makes the browser jump to the corresponding time
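The browser behaviour can be sketched with a mapping from stroke cells to their time stamps. The data layout (cell coordinates mapped to seconds) and both function names are assumptions.

```python
def visible_state(stroke_times, now):
    """Mark each stroke cell as 'drawn' or 'ghost' at playback time `now`."""
    return {
        cell: ("drawn" if t <= now else "ghost")
        for cell, t in stroke_times.items()
    }

def seek_time(stroke_times, clicked_cell):
    """Return the playback time to jump to, or None if the cell holds no stroke."""
    return stroke_times.get(clicked_cell)
```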
Conclusions
• Works well for a "cooperating" drawer (complete occlusion of a full cluster is very unlikely, unless the person stands perfectly still)
• A slider adjusts the "ghost" transparency
• Postprocessing on modern machines takes about 1/3 of the conference time
• Any region of the WB that is never exposed to camera is missed (trivial)
Conclusions
• Instead of a still camera, a high-resolution HDTV camera could be used, at high cost
• DM does not yet:
– Recognize pointing at the WB
– Recognize user actions (entering/leaving the room)
– Use speech recognition to automate transcripts
– Use DRM to provide data access control