Video encoding for Web and Archive
High Definition Video Working Group APAN Bandung 2014
Andrew Howard - The Australian National University
Background
• Significant Digital Humanities media asset library stored on film, video, audio tapes, portable hard drives, CD, DVD and Blu-Ray
• Some assets reaching end of life requiring ongoing preservation activity
• Also need to handle media stored on online services such as YouTube and Vimeo
• Build on experience with encoding for Digital Lecture Delivery system and iTunesU
Problem Space
• Media degradation
• Maintain quality and original format, encapsulating the proprietary playback system, application and OS environment using an emulator and/or virtual machine
• Encode to an industry standard archive format at high bitrate to support re-encoding using evolving compression standards
• Deliver to a range of target playback systems
• Organise, Identify and Describe assets and the content of assets
• Cost of ingest, conversion, classification, storage and delivery
Some actual collections
Seismology data (586 DAT tapes)
Media Degradation
• Digital Media
• DVD
• Physical damage
• Dye process
• Storage
• Hard drives
• Physical damage
• Magnetic coherence
Media Degradation
• Tape media (Video, Reel to Reel, DAT, Magtape)
• Storage
• Physical
• Replay devices
Content Management
• Organisation, Classification and Description of assets and the content of the assets
• E-Culture WG Session on Linked Data
Media preparation Video
• Video tape
• Format: PAL,NTSC,SECAM,HDV
• Aspect ratio: 4:3, 16:9, 3:2, 8:5, Anamorphic
• Frame type: Interlaced or Progressive
• Pixel format: Rectangular or Square
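Captured standard-definition video usually has rectangular pixels, so the stored frame width differs from the width needed for square-pixel playback. A minimal sketch of that arithmetic, assuming the common 720x576 (PAL) and 720x480 (NTSC) capture sizes; the function name square_pixel_width is illustrative only and is not part of any tool:

```shell
#!/bin/sh
# square_pixel_width HEIGHT DAR_NUM DAR_DEN
# Prints the width that gives the requested display aspect ratio with
# square pixels, rounded down to an even number (most encoders expect
# even frame dimensions).
square_pixel_width() {
    height=$1
    num=$2
    den=$3
    width=$(( height * num / den ))
    echo $(( width / 2 * 2 ))
}

square_pixel_width 576 4 3    # PAL 4:3
square_pixel_width 576 16 9   # PAL 16:9 anamorphic
square_pixel_width 480 4 3    # NTSC 4:3
```

The results (768, 1024 and 640) are candidate target widths when scaling a capture for web delivery; the capture itself would normally be kept at its stored size.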
Media preparation Video
• Video tape - General Information
• Clean the VCR heads regularly
• Use a video enhancer hardware device like the Canopus to provide additional signal stabilisation, chroma correction and retiming
• Adjust VCR tracking
• Use highest available device resolution for capture
• Use the highest-quality connection the device offers for capture, in order of preference:
• DV
• S-Video
• Composite
Media preparation DVD
• DVD
• Format: PAL,NTSC,SECAM,HDV
• Aspect ratio: 4:3, 16:9, Anamorphic
• Frame type: Interlaced or Progressive
• Region coding and DRM
DVD Encoding Tools
• Older tools:
• (Windows)
• DVD Decrypter
• DVD Shrink
DVD ingest
• Experienced read problems on both commercial and user-created DVD media, from both controlled and uncontrolled environments
• Best results using a Blu-Ray drive to read media which standard DVD drives failed to read
Encoding Tools
• Contemporary tools (OS X & Windows):
• Handbrake
• DVD decoding
• DVD and file Encoding into many formats
• VLC
• The “Swiss Army Knife” for media
Cataloging, Tagging and Identification
• XMP:Description
• MP3 tags
• iTunes tags
• Tools
• exiftool
• read and write asset metadata
• mkvinfo
• Face and Object recognition with CoreImage and OpenCV
Command line tools
• ffmpeg
• VLC
• vpxenc
• MKVToolNix
ffmpeg recipes
Generate a JPEG poster frame from the video at #SECONDS from the start (15-20 is typical).
ffmpeg -i {INPUT} -y -f mjpeg -vf scale="320:trunc(ow/a/2)*2" -vframes 1 -ss {#SECONDS} {OUTPUT}
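The scale expression in the poster-frame recipe fixes the output width at 320 and derives an even height from the input aspect ratio (in ffmpeg's scale filter, a is iw/ih). A rough sketch of the same calculation in integer shell arithmetic, useful for predicting the output size before encoding; even_height is a hypothetical helper name:

```shell
#!/bin/sh
# even_height IN_WIDTH IN_HEIGHT OUT_WIDTH
# Integer form of trunc(ow/a/2)*2 with a = IN_WIDTH/IN_HEIGHT:
# halve, truncate, then double, so the result is always even.
even_height() {
    iw=$1
    ih=$2
    ow=$3
    echo $(( ow * ih / (2 * iw) * 2 ))
}

even_height 1280 720 320   # 720p source
even_height 720 576 320    # PAL-sized source (stored pixels, ignoring pixel aspect)
even_height 854 480 320    # a source whose exact height would be fractional
```

These print 180, 256 and 178. The rounding to an even height matters because yuv420p encoders reject odd dimensions.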
ffmpeg recipes
Theora Video @1.2M, Vorbis Audio @128k
ffmpeg -i {INPUT} -y -codec:v libtheora -b:v 1200k -qscale:v 6 -codec:a libvorbis -qscale:a 5 -b:a 128k -ar 22050 {OUTPUT}
ffmpeg recipes
H.264 Video @10 Mbps, AAC Audio @384k, lossless (-qp 0), width preserved
ffmpeg -i {INPUT} -metadata media_type=10 -metadata hd_video=0 -threads 0 -acodec libfaac -ac:a 2 -b:a 384000 -vcodec libx264 -pix_fmt yuv420p -b:v 10240k -preset veryslow -tune film -qp 0 -movflags +faststart {OUTPUT}
ffmpeg recipes
H.264 Video @1.2 Mbps, AAC Audio @128k, scaled to width 320 with the height matching the input aspect ratio
ffmpeg -i {INPUT} -metadata media_type=10 -metadata hd_video=0 -threads 0 -acodec libfaac -ac:a 2 -b:a 128000 -vcodec libx264 -pix_fmt yuv420p -b:v 1200k -vf scale="320:trunc(ow/a/2)*2" -profile:v main -preset medium -crf 18 -level 3.1 -movflags +faststart {OUTPUT}
ffmpeg recipes
WebM (VP8) Video @1.2 Mbps, Vorbis Audio @128k
ffmpeg -i {INPUT} -y -threads 8 -codec:v libvpx -qscale:v 6 -b:v 1.2M -codec:a libvorbis -crf 10 -qscale:a 5 -b:a 128k -ar 22050 {OUTPUT}
Note: an explicit thread count (-threads 8) is required for multithreading with libvpx; -threads 0 doesn't work.
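When the recipes are applied to a whole collection, each source file yields a poster frame plus Theora, H.264 and WebM derivatives. A small sketch of the naming step, assuming one output per format placed beside the source; the function name web_outputs, the extension list and the example file names are all hypothetical:

```shell
#!/bin/sh
# web_outputs INPUT
# Prints the derivative file names for one source file by replacing the
# last extension with each target extension.
web_outputs() {
    base=${1%.*}                 # strip the final ".ext"
    for ext in jpg ogv mp4 webm; do
        echo "$base.$ext"
    done
}

# Example: list what would be produced for two (made-up) source files.
for src in lecture01.mov interview.avi; do
    web_outputs "$src"
done
```

Each printed name would be substituted for {OUTPUT} in the matching recipe, with the source file as {INPUT}.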
Playback
• Target: Web browsers & mobile devices using HTML5 <VIDEO> and <AUDIO>
Firefox IE Chrome Safari/Webkit
H.264/AAC x x
WebM x x x
Theora/Vorbis x x
FLV x x
MP3 x x x
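Because no single format plays in every browser, an HTML5 video element can list several sources and let the browser pick the first type it supports. A minimal sketch that generates such markup, assuming the derivatives share a base name and use the .mp4, .webm, .ogv and .jpg extensions; video_tag and the file naming are illustrative assumptions:

```shell
#!/bin/sh
# video_tag BASE
# Prints an HTML5 <video> element with one <source> per encoded
# derivative; the browser plays the first type it can decode.
video_tag() {
    base=$1
    cat <<EOF
<video controls poster="$base.jpg">
  <source src="$base.mp4" type="video/mp4">
  <source src="$base.webm" type="video/webm">
  <source src="$base.ogv" type="video/ogg">
</video>
EOF
}

video_tag lecture01
```

A player library with a Flash fallback would wrap or replace this element for browsers without HTML5 video support.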
YouTube html5
Web playback HTML5 and flash fallback for H.264
• Examined a range of open source players
• Projekktor, osmplayer, JWplayer and MediaElement
• Selected MediaElement for quality of API, documentation, support of SRT subtitles and plugin support
• mediaelementjs.com
Archive
• Maintain original format to allow re-encoding at a later time
• Generate a high quality 2-pass H.264 version
• Generate a high quality DVD version
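A two-pass H.264 encode runs ffmpeg twice: the first pass only analyses the video, and the second uses that analysis to distribute the bit budget. The sketch below prints the two command lines rather than running them. It assumes ffmpeg's standard -pass option and the built-in aac encoder (the recipes in this deck use libfaac instead); two_pass_cmds and the file names are hypothetical:

```shell
#!/bin/sh
# two_pass_cmds INPUT OUTPUT VIDEO_BITRATE
# Prints the pair of ffmpeg invocations for a two-pass encode.
two_pass_cmds() {
    src=$1
    dst=$2
    rate=$3
    # Pass 1: analysis only, so audio is dropped and the output discarded.
    echo "ffmpeg -y -i $src -c:v libx264 -preset veryslow -b:v $rate -pass 1 -an -f mp4 /dev/null"
    # Pass 2: reads the pass-1 log written to the working directory.
    echo "ffmpeg -i $src -c:v libx264 -preset veryslow -b:v $rate -pass 2 -c:a aac -b:a 384k -movflags +faststart $dst"
}

two_pass_cmds master.mov archive.mp4 10240k
```

Both passes need to run from the same directory with the same video settings; file names containing spaces would need quoting before the printed lines are executed.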
Future Codecs
• Increased range of macro block forms
• Larger inter frame comparison
• Smaller files make better use of bandwidth for existing assets and allow higher-definition, higher-clarity delivery over existing transmission systems
• H.265/HEVC
• VP9
• Jan 2014 code release
VP9
• Google next generation codec
• libvpx code available
• Latest VLC and Chrome will play
• YouTube is a significant market driver
VP9
• Google next generation codec
• Original video size: 108,887,661 bytes (108.9 MB)
• x264 encode fps:
• VP8
• Single pass ffmpeg encode size: 122,716,927 bytes (122.7 MB), includes audio
• vpxenc 2-pass size: 24,488,846 bytes (24.5 MB), video only
• encode fps:
• Pass 1/2 frame 3857/3858 555552B 1152b/f 27327b/s 131187 ms (29.40 fps)
• Pass 2/2 frame 3857/3857 24452079B 50717b/f 1202781b/s 118922 ms (32.43 fps)
• VP9
• Single pass encode: vpxenc --codec=vp9 -t 7 -o APAN_demo_nasa.vp9.webm -w 1280 -h 720 --cpu-used=4 -p 1 --target-bitrate=1200 --kf-max-dist=360 APAN_demo_nasa.vp8_1.y4m
• Pass 1/1 frame 3857/3857 24813041B 51465b/f 1220537b/s 998347 ms (3.86 fps)
• vpxenc 2 pass encode fps:
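The fps figure at the end of each vpxenc progress line is simply frames divided by elapsed time. A small sketch that reproduces it from the frame count and milliseconds shown above, using integer arithmetic only; encode_fps is an illustrative helper name:

```shell
#!/bin/sh
# encode_fps FRAMES MILLISECONDS
# Prints frames per second to two decimals (truncated, not rounded).
encode_fps() {
    hundredths=$(( $1 * 100000 / $2 ))
    printf '%d.%02d\n' $(( hundredths / 100 )) $(( hundredths % 100 ))
}

encode_fps 3857 118922   # VP8, pass 2 of 2
encode_fps 3857 998347   # VP9, single pass
```

This prints 32.43 and 3.86, matching the logged values: on this clip the VP9 single-pass encode ran at roughly an eighth of the VP8 speed.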
File size comparison (preliminary testing results)
Video sample   MOV bytes     MOV MB   AVI bytes     AVI MB   Encode FPS
Original       108,887,661   108.9    591,552,512   591.5
vp8            39,153,628    39.1     444,498,772   444.4    ~30
vp9            31,741,303    31.7     216,773,028   216.7    ~3-4
x264           97,624,542    97.6     379,207,254   379.2    ~450
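The byte counts are easier to compare as a share of the original file. A quick sketch of that calculation for the MOV sample, using the sizes from the comparison above; percent_of is a hypothetical helper, and the multiplication assumes a shell with 64-bit arithmetic:

```shell
#!/bin/sh
# percent_of ORIGINAL_BYTES ENCODED_BYTES
# Prints the encoded size as a percentage of the original, to one
# decimal place (truncated). Needs 64-bit shell arithmetic.
percent_of() {
    tenths=$(( $2 * 1000 / $1 ))
    printf '%d.%d\n' $(( tenths / 10 )) $(( tenths % 10 ))
}

percent_of 108887661 39153628   # vp8
percent_of 108887661 31741303   # vp9
percent_of 108887661 97624542   # x264
```

This prints 35.9, 29.1 and 89.6, so on this one clip the vp9 file is a little under a third of the original size. These are preliminary figures for a single sample and not a general codec ranking.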
VP8 and VP9 tools
• vpxdec
• Extract a yuv4 uncompressed video
• vpxdec --progress --postproc --mfqe -t 7 -o APAN_demo_nasa.vp8.y4m APAN_demo_nasa.vp8.webm
• mkvextract tracks
• mkvextract tracks "APAN_demo_nasa.vp8.webm" 1:APAN_demo_nasa.vp8.ogg
• mkvmerge
HEVC/H.265
• svn checkout https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-1.0/ HM-1.0
• Still testing encoding
HM software: Encoder Version [1.0][Mac OS X][GCC 4.2.1][64 bit]

Input File : APAN_demo_nasa.vp8_1.y4m
Bitstream File : APAN_demo_nasa.vp8_1.bin
Reconstruction File : APAN_demo_nasa.vp8_1_enc.yuv
Real Format : 1280x720 30Hz
Internal Format : 1280x720 30Hz
Frame index : 0 - 8 (9 frames)
Number of Ref. frames (P) : 1
Number of Ref. frames (B_L0) : 1
Number of Ref. frames (B_L1) : 1
Number of Reference frames : 1
CU size / depth : 128 / 5
RQT trans. size (min / max) : 4 / 32
Max RQT depth inter : 2
Max RQT depth intra : 1
Motion search range : 128
Intra period : 32
QP : 32.00
GOP size : 8
Rate GOP size : 8
Bit increment : 4
Luma interpolation : Samsung 12-tap filter
Chroma interpolation : Bi-linear filter
Entropy coder : CABAC

TOOL CFG: ALF:1 IBD:1 HAD:1 SRD:1 RDQ:1 SQP:0 ASR:0 PAD:0 LDC:0 NRF:1 BQP:0 GPB:0 FEN:0 RQT:1 MRG:1

POC 0 ( I-SLICE, QP 32 ) 928 bits [Y 68.0431 dB U 71.3615 dB V 99.9900 dB] [ET 32 ] [L0 ] [L1 ]
POC 8 ( P-SLICE, QP 33 ) 123656 bits [Y 39.3042 dB U 43.1461 dB V 45.9746 dB] [ET 77 ] [L0 0 ] [L1 ]
POC 4 ( B-SLICE, QP 34 ) 17248 bits [Y 40.1223 dB U 44.9601 dB V 47.3997 dB] [ET 412 ] [L0 0 ] [L1 8 ]
POC 2 ( B-SLICE, QP 35 ) 6576 bits [Y 44.9149 dB U 48.0378 dB V 51.2781 dB] [ET 351 ] [L0 0 ] [L1 4 ]
POC 6 ( B-SLICE, QP 35 ) 10208 bits [Y 37.7147 dB U 42.0799 dB V 45.0517 dB] [ET 479 ] [L0 4 ] [L1 8 ]
POC 1 ( B-SLICE, QP 36 ) 984 bits [Y 63.3660 dB U 50.5459 dB V 51.4867 dB] [ET 195 ] [L0 0 ] [L1 2 ]
POC 3 ( B-SLICE, QP 36 ) 2376 bits [Y 40.5029 dB U 45.3647 dB V 48.8884 dB] [ET 398 ] [L0 2 ] [L1 4 ]
POC 5 ( B-SLICE, QP 36 ) 1864 bits [Y 37.1593 dB U 42.6453 dB V 45.4117 dB] [ET 322 ] [L0 4 ] [L1 6 ]
POC 7 ( B-SLICE, QP 36 ) 5856 bits [Y 35.2758 dB U 40.2356 dB V 43.8896 dB] [ET 291 ] [L0 6 ] [L1 8 ]

SUMMARY --------------------------------------------------------
Total Frames | Bitrate Y-PSNR U-PSNR V-PSNR
9 a 565.6533 45.1559 47.5974 53.2634

I Slices--------------------------------------------------------
Total Frames | Bitrate Y-PSNR U-PSNR V-PSNR
1 i 27.8400 68.0431 71.3615 99.9900

P Slices--------------------------------------------------------
Total Frames | Bitrate Y-PSNR U-PSNR V-PSNR
1 p 3709.6800 39.3042 43.1461 45.9746

B Slices--------------------------------------------------------
Total Frames | Bitrate Y-PSNR U-PSNR V-PSNR
7 b 193.3371 42.7223 44.8385 47.6294

Total Time: 2558.456 sec.
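The SUMMARY bitrate and Y-PSNR in the HM output can be cross-checked from the per-frame POC lines: bitrate is total bits times the frame rate divided by the frame count, and the PSNR is a plain mean over frames. A sketch using awk, assuming the line layout shown in the log above; hm_summary is a hypothetical helper and not part of the HM software:

```shell
#!/bin/sh
# hm_summary FPS
# Reads HM per-frame "POC" lines on stdin and prints
# "<bitrate in kbit/s> <mean Y-PSNR in dB>".
hm_summary() {
    awk -v fps="$1" '
        $1 == "POC" {
            for (i = 1; i < NF; i++) {
                if ($(i + 1) == "bits") bits += $i        # frame size precedes "bits"
                if ($i == "[Y")         psnr += $(i + 1)  # luma PSNR follows "[Y"
            }
            frames++
        }
        END {
            if (frames > 0)
                printf "%.4f %.4f\n", bits * fps / frames / 1000, psnr / frames
        }'
}

hm_summary 30 <<'EOF'
POC 0 ( I-SLICE, QP 32 ) 928 bits [Y 68.0431 dB U 71.3615 dB V 99.9900 dB]
POC 8 ( P-SLICE, QP 33 ) 123656 bits [Y 39.3042 dB U 43.1461 dB V 45.9746 dB]
POC 4 ( B-SLICE, QP 34 ) 17248 bits [Y 40.1223 dB U 44.9601 dB V 47.3997 dB]
POC 2 ( B-SLICE, QP 35 ) 6576 bits [Y 44.9149 dB U 48.0378 dB V 51.2781 dB]
POC 6 ( B-SLICE, QP 35 ) 10208 bits [Y 37.7147 dB U 42.0799 dB V 45.0517 dB]
POC 1 ( B-SLICE, QP 36 ) 984 bits [Y 63.3660 dB U 50.5459 dB V 51.4867 dB]
POC 3 ( B-SLICE, QP 36 ) 2376 bits [Y 40.5029 dB U 45.3647 dB V 48.8884 dB]
POC 5 ( B-SLICE, QP 36 ) 1864 bits [Y 37.1593 dB U 42.6453 dB V 45.4117 dB]
POC 7 ( B-SLICE, QP 36 ) 5856 bits [Y 35.2758 dB U 40.2356 dB V 43.8896 dB]
EOF
```

For these nine frames the script prints 565.6533 45.1559, the same values as the SUMMARY line. The logged total time of 2558.456 seconds for nine frames works out to roughly 284 seconds per frame.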
Next Generation Codecs
• Trade increased encoding time and CPU for decreased bandwidth
• Promise of significant gains in compression
• Reference code and specifications now available
• Still tuning for performance
• Google VP9 developer videos on YouTube
VP9 test sequence
• Convert input video to vp8
• Encode to vp9
• 2 pass: vpxenc --codec=vp9 -t 7 -o APAN_demo_nasa.vp9_2_pass_clang.webm -w 1280 -h 720 --cpu-used=4 -p 2 --target-bitrate=1200 --kf-max-dist=360 APAN_demo_nasa.vp8_1.y4m
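The test sequence can be scripted so that the same steps are repeatable for other clips. The sketch below only prints the commands, reusing the vpxdec and vpxenc options that appear in this deck. The 1280x720 size is the frame size reported for this clip, and vp9_test_cmds and the base-name convention are assumptions:

```shell
#!/bin/sh
# vp9_test_cmds BASE
# Prints the decode and encode commands for one clip named BASE.vp8.webm.
vp9_test_cmds() {
    base=$1
    # Step 1: decode the VP8 WebM to uncompressed YUV4MPEG2 for the encoder.
    echo "vpxdec --progress -t 7 -o $base.vp8.y4m $base.vp8.webm"
    # Step 2: two-pass VP9 encode; with -p 2 vpxenc runs both passes itself.
    echo "vpxenc --codec=vp9 -t 7 -o $base.vp9.webm -w 1280 -h 720 --cpu-used=4 -p 2 --target-bitrate=1200 --kf-max-dist=360 $base.vp8.y4m"
}

vp9_test_cmds APAN_demo_nasa
```

Piping the output to sh would run the steps in order; the intermediate y4m file is uncompressed and can be deleted once the encode finishes.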
Summary
• Media preparation
• ffmpeg recipes for media encoding for Web and Archive
• Next Generation Codecs