ece 5582 computer vision - sce.umkc.edu · ieee computer 1995 special issue on content based image...
TRANSCRIPT
Spring 2019: Venu: Haag 315, Time: M/W 4-5:15pm
ECE 5582 Computer VisionLec 01: Introduction
Zhu LiDept of CSEE, UMKC
Office: FH560E, Email: [email protected], Ph: x 2346.http://l.web.umkc.edu/lizhu
Z. Li: ECE 5582 Computer Vision, 2019. p.1
slides created with WPS Office Linux and EqualX LaTex equation editor
Outline
Background Objective of the class Prerequisite Lecture Plan Course Project Q&A
Z. Li: ECE 5582 Computer Vision, 2019. p.2
An image is worth a thousand words….
What we observe are pixels…. The story: The train wreck at La Gare
Montparnasse, 1895
What computer can do these days: Figure out the building The train People walking around
Still long way to go to figure out the semantics Train crashes It is an abnormal event (context)
Z. Li: ECE 5582 Computer Vision, 2019. p.3
La Gare Montparnasse, 1895
Advances in Image Sensors: pixels and voxels
p.4Z. Li: ECE 5582 Computer Vision, 2019.
More than 20 years of Image Retrieval Research…
IEEE Computer 1995 Special Issue on Content Based Image Retrieval (CBIR)
Z. Li: ECE 5582 Computer Vision, 2019. p.5
Dr. Raghavan, VijayDistinguished ProfessorCenter for Adv. Computer StudiesUniv of Louisiana at Lafayette
NSF Digital Librararies Initiative Relevance Feedback in CBIR
Z. Li: ECE 5582 Computer Vision, 2019. p.6
MPEG-7 Visual Features (Circa 2003)
Color, Shape, Texture Features for Image Search
Z. Li: ECE 5582 Computer Vision, 2019. p.7
Color Texture MotionShape
• Contour Shape
• Region Shape
• Camera motion
• Motion Trajectory
• Parametric motion
• Motion Activity
• Texture Browsing
• Homogeneous texture
• Edge Histogram
1. Histogram
• Scalable Color
• Color Structure
• GOF/GOP
2. Dominant Color
3. Color Layout
ImageNet - Deep Learning Classification
Tasks: Image Classification, Object Detection & Localization 2012: Fisher Vector 2013: Deep Learning ~ Conv Neural Networks (CNN) .e.g. AlexNet 2016: (Very) Deep Learning ~ Residual Neural Networks (ResNet),
K. He, MSRA/FAIR.
Z. Li: ECE 5582 Computer Vision, 2019. p.8
MPEG CDVS - Identification Compact Descriptor for Visual Search (CDVS) Object Re-Identification Applications: Navigation, Query by Capture, AR/VR
Technology: Key Point (SIFT) detection Fisher Vector Aggregation and Hashing (for shortlisting) SIFT compression
Performance Verification: 90+% precision on 1% recall Retrieval : mAP in 80~90%.
Z. Li: ECE 5582 Computer Vision, 2019. p.9
Depth Sensing/SLAM
Key problems for auto driving cars• Depth from Stereo Images• Optical Flow• Scene Flow• 2D/3D data fusion and registration• Image/3D features for SLAM • Higher level syntactic object/event
recognition
Z. Li: ECE 5582 Computer Vision, 2019. p.10
Image Analysis Pipeline Handcrafted Feature Based
Z. Li: ECE 5582 Computer Vision, 2019. p.11
Pixels Features Aggregation/Dimension Reduction
Classifier
{Cat, Dog}
Image Analysis Pipeline Holistic Image Analysis Direction Pixel Projection Subspace Models
Convolutional Neural Networks
Z. Li: ECE 5582 Computer Vision, 2019. p.12
h
w
X in Rhxw
Y=AX
Outline
Background Objective of the class Prerequisite Lecture Plan Course Project Q&A
Z. Li: ECE 5582 Computer Vision, 2019. p.13
Prerequisite & Text book Prerequisite
For senior and graduate students in EE/CS Good Matlab/C programming skills. Some Python is
also desirable. Taken Signal & System, or Digital Signal Processing or
consent of the instructor Will have different expectation for MS/PhD and
undergrad students Textbook:
None required (saving $$) , will distribute relevant chapters, papers, and notes.
Key References: R. Szeliski, Computer Vision: Algorithms and
Applications, Springer, 2014. URL: http://szeliski.org/Book/
J. E. Solem, Programming Computer Vision with Python, O’Reilly, 2015. URL: http://programmingcomputervision.com/downloads/ProgrammingComputerVision_CCdraft.pdf
Z. Li: ECE 5582 Computer Vision, 2019. p.14
Tentative Lecture Plan Image Processing Basics
Camera model and image formation Image filtering
Image Features for Retrieval Color Features Texture and Shape Features Basic Image Retrieval System and
Metrics Object Identification in Image
Key Point Detection Key Point Feature Description Fisher Vector Aggregation MPEG Mobile Visual Search
Technology and Standard Holistic Approach in Image
Understanding Subspace methods for face recognition:
Eigenface, Fisherface, Laplacianface. Deep Learning in Image Understanding
Z. Li: ECE 5582 Computer Vision, 2019. p.15
Homework 1: Image Filtering and Features
Homework 2: Image Retrieval System
Homework 3:SIFT/VGG Feature Aggregation
Homework 4:Subspace and deep learningmethods in face recognition
Potential Course/MS thesis Project Resources from last year: http://sce2.umkc.edu/csee/lizhu/teaching/2017.spring.image-
analysis/main.html
Potential projects with 25% bonus points Boosting based key point feature detection via box filtering for
Hyperspectral images Compact deep learning models for embedded vision Image + Point Cloud based recognition Image registration with point cloud Image Compression for vision tasks
Z. Li: ECE 5582 Computer Vision, 2019. p.16
Embedded Deep Learning for Image Understanding
• Automatic Image Tagging:• Mapping image pixel values to tags• Device based recognition:
• Distilling the knowledge into a compact form• Compact CNN model for knowledge distilling
– BIGDATA training set: >10k images per tag– training set augmentation:
» affine transforms, simulated lighting changes– Pruning filters from the CNN structure
p.17Z. Li: ECE 5582 Computer Vision, 2019.
Immersive Visual Communication
Light Field Compression (6-DoF) Sub-Aperture Image Based Tensorial Display Decomposition Based
Point Cloud Compression (6-DoF) Video Based: Views + Depth Binary Tree/Octree Geometry Coding + Graph Signal Compression
360 Video Compression (3-DoF) Advanced Motion Model: Affine motion, Spherical motion models Padding
p.18Z. Li: ECE 5582 Computer Vision, 2019.
Deep Learning in Video Compression Deep Learning Chroma prediction (from Luma)
Results: BD rate:
p.19
C1:128@16×16
ConvolutionKernel: 5×5 Convolution
Kernel: 3×3,5×5
C2: (96+32)@16×16
C3: (96+32)@16×16C4: 2@16×16
ConvolutionKernel: 1×1,3×3
ConvolutionKernel: 3×3
F1: 256@1×1F2: 196@1×1
F3: 128@1×1
Tile: 128@16×16
Fusion: 128@16×16
Full Connection Full Connection
Full Connection Tiling
Element wise Product
Input: 1@16×16
Input: 99@1×1
Down-sampled Reconstructed
Luma
Neighboring Reconstructed
Luma & Chroma
Predicted Chroma
deep chroma predicted blocks
Z. Li: ECE 5582 Computer Vision, 2019.
Deep Learning in IBC prediction
IBC with deep learning IBC matching block fuse with intra ref
column/row pixels
Performance :
p.20
Overall, 1.3%, 1.4%, 1.6% bits saving for Y, U, V component, respectively.
Peak performance on Luma component on BQTerrace achieves 3.6% BD-Rate reduction.
Z. Li: ECE 5582 Computer Vision, 2019.
Low Light Image Denoising & Understanding
Use cases Low light imaging Low light image understanding
Deep Scaling & Residual learning
p.21Z. Li: ECE 5582 Computer Vision, 2019.
Results
Low light image denoising
p.22
SID Unet
Ours
Z. Li: ECE 5582 Computer Vision, 2019.
Course Outcome
Upon completion of the course you will be able to: Understand the basic operations in image formation and filtering Understand basic image features for retrieval: color, shape, texture Understand key point features and aggregation in object
identification Understand the holistic appearance modeling approach in image
understanding Understand the latest image analysis and understanding techniques
like deep learning . Can apply the knowledge an algorithms to solve real world image
understanding and retrieval problems Well prepared for conducting advanced research and pursing
career/PhD in this topic area. (PhD qualify required course)
Z. Li: ECE 5582 Computer Vision, 2019. p.23
Grading
Homeworks (40%) Image Filtering and Basic Features Image Retrieval System and Performance Metrics Key Point Feature and Fisher Vector Aggregation in Object
Identification Subspace Models in Image Understanding
2 Quizzes (20%) : relax, quiz is actually on me, to see where you guys stand Quiz-1: Part I and II Quiz-2: Part III and IV
Project (40%) Original work leads to publication, discuss with me by the mid of
October. (25% bonus point, worth 50 points) Regular project: assign papers to read, implement certain aspect, and
do a presentation.
Z. Li: ECE 5582 Computer Vision, 2019. p.24
Logistics
Office Hour: Tue/Thr 2:30-4:00pm, 560E FH Or by appointment
TA: TBD Lab Sessions are planned to cover certain software tools aspects. Office Hour: TBA
Course Resources: Will share a box.com folder with slides, references, data set, and
software Additional reference, software, and data set will be announced.
Z. Li: ECE 5582 Computer Vision, 2019. p.25
NSF Center for Big Learning
Short Bio:
Research Interests: Immersive visual communicaiton: light field, point
cloud and 360 video coding and low latency streaming
Low Light, Res and Quality Image Understanding What DL can do for compression (intra, ibc, sr, inter,
end2end) What compression can do for DL (compression,
acceleration)
signal processing and learning
image understanding visual communication mobile edge computing & communication
p.26
Multimedia Computing & Communication LabUniv of Missouri, Kansas City
Z. Li: ECE 5582 Computer Vision, 2019.
Q&A
Q&A
Z. Li: ECE 5582 Computer Vision, 2019. p.27