![Page 1: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/1.jpg)
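New Challenges: Content-Based Information Processing
Zhang Bo (张钹)
Tsinghua National Laboratory for Information Science and Technology; State Key Laboratory of Intelligent Technology and Systems
School of Information Science and Technology and Department of Computer Science, Tsinghua University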
![Page 2: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/2.jpg)
I. Classical Information Theory
Semantics vs. Information Processing
• Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities.
• These semantic aspects of communication are irrelevant to the engineering problem.
C. E. Shannon, A mathematical theory of communication, 1948
![Page 3: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/3.jpg)
Classical Communication Model
• Communication Model:
Sender
Receiver
(Markov) Stochastic Process
-C. E. Shannon
$P(x), \quad P(y \mid x), \quad P(x \mid y)$
![Page 4: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/4.jpg)
Turing Computation Theory
Deterministic Turing Machine (1937)
Computability
Algorithms
Computational Complexity
[Figure: Turing machine with a read-write head (P) over a tape whose cells are indexed -4, -3, -2, -1, 0, +1, driven by a finite state controller (Q)]
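The machine on this slide (tape, read-write head, finite state controller) can be sketched in a few lines. The following is a minimal illustration, not part of the original talk; the state names `q0` and `halt`, the blank symbol `_`, and the bit-flipping transition table are all assumptions chosen for the example.

```python
from collections import defaultdict

def run_turing_machine(transitions, tape, state="q0", halt="halt", blank="_", max_steps=10_000):
    """Simulate a deterministic single-tape Turing machine.

    transitions maps (state, symbol) -> (new_state, written_symbol, move),
    where move is -1 (head moves left) or +1 (head moves right).
    """
    cells = defaultdict(lambda: blank)      # tape cells indexed ..., -1, 0, +1, ...
    for i, symbol in enumerate(tape):
        cells[i] = symbol
    head, steps = 0, 0
    while state != halt:
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        key = (state, cells[head])
        if key not in transitions:
            raise RuntimeError(f"no transition defined for {key}")
        state, cells[head], move = transitions[key]
        head += move
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells[i] for i in range(lo, hi + 1)).strip(blank)

# Hypothetical controller: invert every bit, halt at the first blank cell.
FLIP_BITS = {
    ("q0", "0"): ("q0", "1", +1),
    ("q0", "1"): ("q0", "0", +1),
    ("q0", "_"): ("halt", "_", +1),
}

print(run_turing_machine(FLIP_BITS, "1011"))
```

The controller is just a lookup table, which is the point of the slide: computability and complexity questions are posed over this very small mechanism.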
![Page 5: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/5.jpg)
Information Processing under Classical Information Theory
The conversion of latent information into manifest information (C. E. Shannon)
X -p(x) Y -
Dissipation Equivocation
Transformation=equivocation-dissipation
Uncertainty (error, noisy, deformation,…)
$p(x \mid y)$
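Equivocation in Shannon's sense is the conditional entropy H(X|Y): the uncertainty about the input that remains after observing the output. A minimal sketch of how it follows from a source distribution p(x) and a channel matrix p(y|x) is given below; the function names and the binary channel examples are assumptions made for illustration only.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def equivocation(p_x, p_y_given_x):
    """H(X|Y) in bits, from p(x) and the channel matrix p(y|x) (rows indexed by x)."""
    n_x, n_y = len(p_x), len(p_y_given_x[0])
    joint = [[p_x[i] * p_y_given_x[i][j] for j in range(n_y)] for i in range(n_x)]
    p_y = [sum(joint[i][j] for i in range(n_x)) for j in range(n_y)]
    h = 0.0
    for i in range(n_x):
        for j in range(n_y):
            if joint[i][j] > 0:
                # p(x|y) = p(x, y) / p(y)
                h -= joint[i][j] * math.log2(joint[i][j] / p_y[j])
    return h

source = [0.5, 0.5]
noiseless = [[1.0, 0.0], [0.0, 1.0]]
useless = [[0.5, 0.5], [0.5, 0.5]]
for name, channel in (("noiseless", noiseless), ("useless", useless)):
    h_x_given_y = equivocation(source, channel)
    print(name, "equivocation:", h_x_given_y, "transmitted:", entropy(source) - h_x_given_y)
```

A noiseless channel leaves no equivocation, while a channel whose output is independent of its input leaves all of H(X) unresolved.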
![Page 6: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/6.jpg)
Applications of Classical Information Theory in Text and Image Processing
The central paradigm of classical information theory is the engineering problem of the transmission of information over a noisy channel.
• Communication: coding, data compression, noisy-channel coding, …
• Text: editor, compression, spelling and syntax correction, …
• Image: editor, lossless compression, noise suppression, enhancement, …
![Page 7: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/7.jpg)
II. Content (Semantics) Based Information Processing
Information (Web) Age
• Complex information of many kinds: text, image, speech, video, …
• A huge amount of data
• Man-Machine Interaction
Content (Semantics) based Information Processing
Information Retrieval, Classification, Summary,
Recognition, Understanding,..
![Page 8: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/8.jpg)
Computer
Input
(Encoding)
Output
(Decoding)
Users
Codes Form
Content
From processing form to processing related to content (semantics)
Natural Man-Machine Interaction
Computer Network
![Page 9: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/9.jpg)
III. Syntactic Analysis: Holistic Coding (Gestalt)
The basic properties among different quotient spaces such as the falsity preserving, the truth preserving properties are discussed. There are three quotient-space model construction approaches.
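Digital video coding technology has a history of half a century and has made great progress: from differential predictive coding in the 1950s, to transform coding and block-based motion-prediction coding in the 1970s, up to today's emerging distributed coding, stereoscopic coding, multi-view coding, visual coding, and so on.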
11001001010100100010101010011111000010101000001000111111011110010010001000100000010010101000100111110100
00001001010100111110101010000011101010110100010001110101000110010110111010010011001000100111100100000111
Semantics S Code (Form) X
?
![Page 10: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/10.jpg)
Rule-based, Knowledge-based, Top-down, Parsing (text)
Advantages: Deliberative behaviors (AI expert systems: decision making, diagnosis, design,…) Specific domain, Small-size
Disadvantages: Perception, Common sense, Natural language, …
Uncertainty (exception, ambiguity, vagueness, …)
![Page 11: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/11.jpg)
Detector: semantically meaningful primitives
Text: word-sentence-chapter-
Image: subpart-part-object-
There is no clear boundary among parts
Segmentation Problem
Descriptor: structural uncertainty
K. S. Fu, Syntactic pattern recognition, New York: Prentice-Hall, 1974
D. Marr, Vision, New York: Freeman, 1982
![Page 12: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/12.jpg)
Image Segmentation (Word Segmentation)
Where is the object?
What is the object?
Chicken or egg?
![Page 13: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/13.jpg)
Structural Analysis: rule-based, syntax
11001001010100100010101010011111000010101000001000111111011110010010001000100000010010101000100111110100
Image: Part, Object, … Text: Words, Sentences, …
• Uncertainty problem
• Scalability
Syntactic analysis faced difficulties!
The basic properties among different quotient spaces such as the falsity preserving, the truth preserving properties are discussed. There are three quotient-space model construction approaches.
![Page 14: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/14.jpg)
IV. Probabilistic and Statistical Models
Image (text) Classification:
Categories
Low level and local features (words)
Easily detectable by computer but less semantically meaningful
Colors, Textures, Bag of words
![Page 15: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/15.jpg)
Uncertainty Management
• How to deal with uncertainty ?
• A probabilistic information processing model
Regularity:
Probabilistic distribution
Examples:
Caltech 101, 25 objects
R. O. Duda & P. E. Hart, Pattern Classification and Scene Analysis, New York: John Wiley & Sons, 1973
[Figure: scatter plot of two sample classes, marked * and ·]
![Page 16: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/16.jpg)
Bag of (Visual) Words
• Defined in image patches (2005-06)
• Descriptors extracted around interest points (2002-2004)
• Edge contours (2005-06)
• Regions (2005-06)
![Page 17: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/17.jpg)
Detector & Descriptor
Kadir salience region (points)
Histograms of Oriented Gradients (HOG), 72 dimensions
Zuo Yuanyuan (2010-)
![Page 18: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/18.jpg)
$$CB(w) = \frac{1}{n} \sum_{i=1}^{n} \begin{cases} 1 & \text{if } w = \arg\min_{v \in V} D(v, r_i) \\ 0 & \text{otherwise} \end{cases}$$
CB: codebook (histograms)
n: the number of regions in an image
r_i: image region i
D(w, r_i): the distance between a codeword w and region r_i
V: vocabulary
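The codebook histogram CB(w) assigns each image region to its nearest codeword and counts the fraction of regions each codeword wins. The sketch below illustrates this with Euclidean distance standing in for D; the choice of distance, the function name, and the toy two-dimensional descriptors are assumptions, since the slide does not specify them.

```python
import math

def codebook_histogram(regions, vocabulary):
    """Return CB(w) for every codeword w in vocabulary order.

    CB(w) is the fraction of regions whose nearest codeword is w
    (hard assignment). Euclidean distance is an assumed choice for D.
    """
    if not regions:
        raise ValueError("at least one region descriptor is required")
    counts = [0] * len(vocabulary)
    for r in regions:
        nearest = min(range(len(vocabulary)), key=lambda k: math.dist(vocabulary[k], r))
        counts[nearest] += 1
    n = len(regions)
    return [c / n for c in counts]

# Hypothetical 2-D descriptors and a two-word vocabulary.
vocabulary = [(0.0, 0.0), (10.0, 10.0)]
regions = [(1.0, 0.0), (0.0, 2.0), (9.0, 9.0), (1.0, 1.0)]
print(codebook_histogram(regions, vocabulary))
```

The resulting vector sums to one and discards where each region came from, which is why the representation is called a bag of visual words.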
![Page 19: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/19.jpg)
Sparse Structure of the Data Space (Sparsity)
![Page 20: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/20.jpg)
Low-dimensional structure of high-dimensional space

| Object | Precision |
|---|---|
| Airplane | 0.947 |
| Car-side | 0.987 |
| Dalmatian | 0.733 |
| Faces | 0.760 |
| Leopard | 0.827 |
| Pagoda | 0.760 |
| Stop-sign | 0.800 |
| Windsor-chair | 0.787 |

Data Structure
![Page 21: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/21.jpg)
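Optimization
$$\hat{x}_1 = \arg\min \|x\|_1 \quad \text{subject to} \quad Ax = y$$
$$\hat{x}_1 = \arg\min \|x\|_1 \quad \text{subject to} \quad \|Ax - y\|_2 \le \varepsilon$$
Sparse representation in sample space
L1 norm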
![Page 22: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/22.jpg)
Image Database
![Page 23: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/23.jpg)
Experimental Results
![Page 24: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/24.jpg)
Spatial Structure of the Data (Non-sparsity)
![Page 25: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/25.jpg)
Data-driven Approach
- learning from web data
German → Speech Recognition → Text 1 → Translation Service → Text 2 → Speech Synthesis → English
Learning from Data
(probabilistic models)
Annotation
Microsoft Research Asia
![Page 26: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/26.jpg)
Fundamental Shortcomings of Probabilistic Methods
• The semantic gap between low-level local features and high-level global concepts
Less semantically meaningful features: colors or their distribution (histogram), gray-values or their distribution, visual words (descriptors from interest points), image patches, image regions, edge, …
• Lack of structural knowledge
Generalization capacity
Information processing without understanding
![Page 27: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/27.jpg)
V. New Research Directions
Sender
Reader
X X
S Uncertainty F (W,D)
![Page 28: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/28.jpg)
The Revival of Syntactic Analysis: Holistic (Gestalt), Probability, Inference
Information Structure Analysis
• Syntactic + Probabilistic
• Data-driven + Knowledge-driven
• Bottom-up + Top-down
• Part-based (Shape-based)
![Page 29: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/29.jpg)
Information Structure
The basic properties among different quotient spaces such as the falsity preserving, the truth preserving properties are discussed. There are three quotient-space model construction approaches.
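Digital video coding technology has a history of half a century and has made great progress: from differential predictive coding in the 1950s, to transform coding and block-based motion-prediction coding in the 1970s, up to today's emerging distributed coding, stereoscopic coding, multi-view coding, visual coding, and so on.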
11001001010100100010101010011111000010101000001000111111011110010010001000100000010010101000100111110100
00001001010100111110101010000011101010110100010001110101000110010110111010010011001000100111100100000111
Semantics Code (Holistic)
?
![Page 30: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/30.jpg)
Information Structure Analysis
History (Structural Analysis)
(1) Linguistics
Information structure: syntactic representation
Information packaging, building the semantics
Focus, topic, background, comment, …
• M. A. K. Halliday (1967), Notes on transitivity and theme in English, Part II, Journal of Linguistics 3:199-244
• W. L. Chafe, Language and consciousness, Language 50:111-133
![Page 31: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/31.jpg)
(2) Psychology
Structural Information Theory
• Coding: descriptive complexities vs. frequencies of occurrence (Shannon)
• Information structure: formalization of visual regularity (iteration, symmetry, and alternation)
E. L. J. Leeuwenberg (1968), Structural information of visual patterns: an efficient coding system in perception, The Hague: Mouton
E. L. J. Leeuwenberg (1969), Quantitative specification of information in sequential patterns, Psychological Review, 76, 216-220
![Page 32: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/32.jpg)
(3) Information Science
Algorithmic Information Theory
Kolmogorov Complexity
16 bits:
1111111111111111
0101101001011010
1001110111010111
1100010001001011
R. J. Solomonoff (1964), A formal theory of inductive inference, Information & Control, V.7, No.1, pp. 1-22, No.2, pp. 224-254
A. N. Kolmogorov (1965), Three approaches to the definition of the quantity of information, Problems of Information Transmission, No. 1, pp. 3-11
![Page 33: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/33.jpg)
Mining and Exploiting Information Structure
• Kolmogorov complexity
• Related data structures: low-dimensional structure in high-dimensional data space
• Under the probabilistic (structure) framework
• Learning from human beings
![Page 34: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/34.jpg)
(1) Kolmogorov Complexity
$$K(x) = \begin{cases} \min\{|p| : f(p) = x\} & \text{if } x \in \operatorname{ran} f \\ \infty & \text{otherwise} \end{cases}$$
The absolute measure of information content
in an individual finite object
Algorithmic information theory
The minimum description length
![Page 35: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/35.jpg)
Information Distance
$$d(x, y) = \frac{\max\{K(x \mid y), K(y \mid x)\}}{\max\{K(x), K(y)\}}$$
K(·) is non-computable
C. Bennett et al., Information distance, IEEE Trans. on Information Theory, vol. 44, no. 4, 1998, pp. 1407-1423
$$d(x, y) = \frac{\min\{K(x \mid y), K(y \mid x)\}}{\min\{K(x), K(y)\}}$$
![Page 36: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/36.jpg)
A statistical approach to Kolmogorov complexity (information distance)
f(x): frequency
N: total number
$$d(x, y) \approx NSD(x, y) = \frac{\max\{\log f(x), \log f(y)\} - \log f(x, y)}{\log N - \min\{\log f(x), \log f(y)\}}$$
Approximate computation method
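Because K(·) cannot be computed, the statistical distance replaces it with occurrence counts. The following sketch evaluates the normalized statistical distance from three counts and a total; the function name and the example counts are hypothetical, and the handling of a zero co-occurrence count is an assumption.

```python
import math

def nsd(f_x, f_y, f_xy, n_total):
    """Normalized statistical distance from occurrence counts.

    f_x, f_y : number of items containing x, respectively y
    f_xy     : number of items containing both x and y
    n_total  : total number of items (N)
    """
    if f_xy == 0:
        return math.inf                     # assumed convention: never co-occur
    log_fx, log_fy = math.log(f_x), math.log(f_y)
    numerator = max(log_fx, log_fy) - math.log(f_xy)
    denominator = math.log(n_total) - min(log_fx, log_fy)
    return numerator / denominator

# Hypothetical counts, not taken from any real corpus.
print(nsd(f_x=1000, f_y=100, f_xy=10, n_total=10**6))
print(nsd(f_x=100, f_y=100, f_xy=100, n_total=10**4))
```

Terms that always occur together get distance 0, and the distance grows as co-occurrence becomes rare relative to the individual frequencies.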
![Page 37: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/37.jpg)
Semantic & Formal Distance
$$CDM(x, y) = \frac{\max\{C(x \mid y), C(y \mid x)\}}{\max\{C(x), C(y)\}}$$
CDM: Compressed Distance Measure (ASCII)
NSD: Normalized Statistical Distance
Xian Zhang, Yu Hao, et al, New information distance measure and its application in question answering system, J. Comput. Sci. & Tech. 23(4), 557-572, 2008
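A compression-based distance of this form can be approximated with any off-the-shelf compressor. The sketch below uses `zlib` and assumes that the conditional term C(x|y) is estimated as C(yx) − C(y), the usual trick in compression distances; neither the compressor nor that estimate is stated in the slides, and the sample byte strings are made up.

```python
import zlib

def compressed_length(data: bytes) -> int:
    """C(x): length in bytes of the zlib-compressed string."""
    return len(zlib.compress(data, 9))

def cdm(x: bytes, y: bytes) -> float:
    """max{C(x|y), C(y|x)} / max{C(x), C(y)}, with C(x|y) ~ C(yx) - C(y)."""
    c_x, c_y = compressed_length(x), compressed_length(y)
    c_x_given_y = compressed_length(y + x) - c_y
    c_y_given_x = compressed_length(x + y) - c_x
    return max(c_x_given_y, c_y_given_x) / max(c_x, c_y)

a = b"information distance measures how much two objects share. " * 20
b = b"zebra quagga okapi lynx vulture jackal puffin walrus ibex. " * 20
print("same text:     ", cdm(a, a))
print("unrelated text:", cdm(a, b))
```

This measures distance between forms (byte strings), which is why the slide contrasts it with the statistical, more semantic NSD.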
![Page 38: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/38.jpg)
Information Distance between Two Texts
$$T_1: \; W_1 = \{w_{11}, w_{12}, \ldots, w_{1n}\}$$
$$T_2: \; W_2 = \{w_{21}, w_{22}, \ldots, w_{2m}\}$$
Text representation: a bag of words
$$D_{\max}(T_1, T_2) = \max\{K(T_1 \mid T_2), K(T_2 \mid T_1)\}$$
$$K(T_1 \mid T_2) = K(W_1 \setminus W_2)$$
$$K(T_2 \mid T_1) = K(W_2 \setminus W_1)$$
$$K(W) = \sum_{w \in W} K(w), \qquad K(w) = -\log f(w)$$
![Page 39: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/39.jpg)
Text (Image) Retrieval
[Figure: texts t_1, t_2, t_3, …, t_n and images i_1, i_2, …, i_n collected from the Web and related through the image-text information distance]
![Page 40: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/40.jpg)
(2) Data structure: low-dimensional structure of high-dimensional space
Sparse Representation
$$\hat{x}_1 = \arg\min \|x\|_1 \quad \text{subject to} \quad Ax = y \quad \text{or} \quad BAx = z$$
A: m × n matrix
x: sample space, n-dimensional
y: pixel, m-dimensional
z: feature space, d-dimensional
Yi Ma, UIUC, USA
![Page 41: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/41.jpg)
(3) Flat structural model: image region annotation (horse, sky, mountain, grass, tree)
Yuan Jinhui (2008-)
![Page 42: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/42.jpg)
Region-adaptive Grid Partition
$$z^* = \arg\max_z p(z \mid I) = \arg\max_z p(I \mid z)\, p(z)$$
$$p(I, z) = \frac{1}{Z} \exp\Big\{ \sum_i f(I(x_i), z_i) + \sum_{i,j} g(z_i, z_j) \Big\}$$
$$p(I, z) = p(I \mid z)\, p(z)$$
I: data
x_i: image position
z: state (feature)
g(z_i, z_j): the probabilistic region-based constraints (co-occurrence)
![Page 43: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/43.jpg)
Different Flat Structural Models
![Page 44: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/44.jpg)
Model Learning
n images; each image has m_i = H × V grids
$$(x, y) = \{(x_i, y_i)\}, \quad i = 1, 2, \ldots, n$$
$$(x_i, y_i) = \{(x_i^j, y_i^j)\}, \quad j = 1, 2, \ldots, m_i$$
(a) i.i.d. generative model
(b) i.i.d. discriminative model
(c) 2-dimensional hidden Markov model (2D HMM)
(d) Markov Random Field (MRF)
(e) Conditional Random Field (CRF)
![Page 45: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/45.jpg)
Label Configuration
Given training data
$$\{(x_i, y_i)\}, \quad i = 1, 2, \ldots, N$$
MAP (maximum a posteriori) label configuration:
$$y^* = \arg\max_{y_{1:m}} P(y_{1:m} \mid x_{1:m})$$
For 2D HMM, MRF, and CRF: using the path-limited Viterbi algorithm
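The path-limited Viterbi algorithm used for the 2D models is not described in the slides. As a point of reference, the sketch below shows the standard one-dimensional Viterbi recursion that finds the MAP label sequence of an ordinary HMM; the label names, observation names, and all probabilities are hypothetical values chosen for the example.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_path, joint_probability) for a 1-D hidden Markov model.

    Maximizing the joint P(y, x) over label sequences y also maximizes
    the posterior P(y | x), because P(x) does not depend on y.
    """
    # best[t][s] = (probability of the best path ending in s at step t, predecessor of s)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prev, prob = max(
                ((r, best[-1][r][0] * trans_p[r][s]) for r in states),
                key=lambda pair: pair[1],
            )
            column[s] = (prob * emit_p[s][obs], prev)
        best.append(column)
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):       # follow the stored predecessors backwards
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]

# Hypothetical region labels and quantized color observations.
states = ("sky", "grass")
start_p = {"sky": 0.6, "grass": 0.4}
trans_p = {"sky": {"sky": 0.7, "grass": 0.3}, "grass": {"sky": 0.4, "grass": 0.6}}
emit_p = {
    "sky": {"blue": 0.5, "mixed": 0.4, "green": 0.1},
    "grass": {"blue": 0.1, "mixed": 0.3, "green": 0.6},
}
print(viterbi(["blue", "mixed", "green"], states, start_p, trans_p, emit_p))
```

In two dimensions the number of joint label configurations per row grows exponentially, which is presumably why the search has to be restricted to a limited set of paths.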
![Page 46: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/46.jpg)
Markov Random Field (MRF) model
$$P(x_{1:m}, y_{1:m}) = \frac{1}{Z} \prod_{(y_i, y_j) \in C_s} \psi(y_i, y_j) \prod_{(y_k, x_k) \in C_0} \phi(y_k, x_k)$$
$$y^* = \arg\max_{y_{1:m}} \prod_{(y_i, y_j) \in C_s} \psi(y_i, y_j) \prod_{(y_k, x_k) \in C_0} \phi(y_k, x_k)$$
P: probabilistic distribution
C_s: labeling clique
C_0: labeling and feature clique
ψ, φ: clique potential functions
y*: the optimal label configuration
![Page 47: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/47.jpg)
Learning Algorithms for Structural Prediction

| Learning rule | Classification | Structural prediction |
|---|---|---|
| Maximal joint likelihood estimation | Naïve Bayesian network | Hidden Markov Model (1966) [1] |
| Maximal conditional likelihood estimation | Logistic regression | Conditional Random Field (2001) [2] |
| Maximal margin learning | SVM | Maximal Margin Markov Net (2003) [3] |
| Maximal entropy discrimination learning | Maximal Entropy Discrimination Model | Maximal Entropy Discrimination Markov Net (2008) [4] |
![Page 48: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/48.jpg)
References
[1] L. E. Baum and T. Petrie. Statistical Inference for Probabilistic Functions of Finite State Markov Chains. The Annals of Mathematical Statistics, Vol. 37, No. 6, pp. 1554-1563, 1966
[2] J. Lafferty et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proc. of International Conference on Machine Learning (ICML), 2001
[3] B. Taskar et al., Max-Margin Markov Networks. Advances in Neural Information Processing Systems (NIPS), 2003
[4] J. Zhu et al., Laplace Maximum Margin Markov Networks, In Proc. of International Conference on Machine Learning (ICML), 2008
![Page 49: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/49.jpg)
Experimental Setup
• 4002 Corel images (384×256 or 256×384)
• 11 basic (region) concepts
• Features: color moment + wavelet
• 5 models: 2 without structural knowledge (GMM, SVM) and 3 with structural knowledge (HMM*, MRF*, CRF*)
![Page 50: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/50.jpg)
Image Region Annotation Categories
![Page 51: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/51.jpg)
Comparison of Different Models
GMM: Gaussian Mixture Model (30 components)
SVM: Support Vector Machine (Gaussian kernel, one-against-one)
HMM: Hidden Markov Model
MRF: Markov Random Field
CRF: Conditional Random Field
Limited Path Viterbi Algorithm
![Page 52: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/52.jpg)
Experimental Results
![Page 53: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/53.jpg)
(4) Latent Hierarchical Model
Web Page Extraction Name, Image, Price, Description, etc.
Web Page
Data Record
Image
Name Desc Price Note
Desc
Note
Data Record
Image
Desc Desc Name Desc Price Note Note
Given Data Record
Hierarchy
Computational efficiency
Long-range dependency
Joint extraction
{image}
{name, price}
{name}
{price}
{name}
{price}
{image}
{name, price}
{desc}
{Head}
{Tail}{Info Block}
{Repeat block}
{Note}
{Note}
Zhu Jun (2008- )
![Page 54: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/54.jpg)
Experiment
Web page extraction: Name, Photo, Price, Description
Models: Multi-layer CRFs, Multi-layer M^3N, PoMEN, Partially observed HCRFs
Data set: 37 templates
Training: 185 pages (5 per template), or 1585 data records
Testing: 370 pages (10 per template), or 3391 data records
Web Page
Data Record
Image
Name Desc Price Note
Desc
Note
Data Record
Image
Desc Desc Name Desc Price Note Note
![Page 55: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/55.jpg)
Results
Performances: Avg F1:
o avg F1 over all attributes
Block instance accuracy:
o % of records whose Name, Image, and Price are correct
Performance on each attribute
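The two measures named on this slide can be computed as sketched below. The record layout (one dictionary per data record), the exact-match criterion, and the toy records are assumptions for illustration; the slides do not say how matches were scored.

```python
def f1_per_attribute(gold, pred, attributes):
    """F1 for each attribute; a prediction counts only if it equals the gold value."""
    scores = {}
    for attr in attributes:
        true_pos = sum(
            1 for g, p in zip(gold, pred)
            if p.get(attr) is not None and p.get(attr) == g.get(attr)
        )
        n_pred = sum(1 for p in pred if p.get(attr) is not None)
        n_gold = sum(1 for g in gold if g.get(attr) is not None)
        precision = true_pos / n_pred if n_pred else 0.0
        recall = true_pos / n_gold if n_gold else 0.0
        total = precision + recall
        scores[attr] = 2 * precision * recall / total if total else 0.0
    return scores

def block_instance_accuracy(gold, pred, keys=("name", "image", "price")):
    """Fraction of records whose Name, Image, and Price are all correct."""
    correct = sum(1 for g, p in zip(gold, pred) if all(p.get(k) == g.get(k) for k in keys))
    return correct / len(gold)

# Hypothetical data records.
gold = [
    {"name": "A", "image": "a.jpg", "price": "1"},
    {"name": "B", "image": "b.jpg", "price": "2"},
]
pred = [
    {"name": "A", "image": "a.jpg", "price": "1"},
    {"name": "B", "image": "b.jpg", "price": "3"},
]
scores = f1_per_attribute(gold, pred, ("name", "image", "price"))
print("avg F1:", sum(scores.values()) / len(scores))
print("block instance accuracy:", block_instance_accuracy(gold, pred))
```

Block instance accuracy is the stricter of the two: one wrong attribute makes the whole record count as an error, while the average F1 only loses that attribute's share.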
![Page 56: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/56.jpg)
(5) Visual mechanisms of the human brain
![Page 57: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/57.jpg)
Top-down feedback
Top-down feedback
Local connection
Data-driven V1 V2 … IT
High-level
Prior-knowledge
Bottom-up and top-down
![Page 58: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/58.jpg)
• 114 basis functions
Images: 12×12 pixels
• HOG
Visual Primitive Elements
![Page 59: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/59.jpg)
Sparse Coding
$$I(x, y) = \sum_i a_i \phi_i(x, y)$$
$$E = \sum_{x,y} \Big[ I(x, y) - \sum_i a_i \phi_i(x, y) \Big]^2 + \lambda \sum_i S\Big(\frac{a_i}{\sigma}\Big)$$
I(x, y): images
φ_i(x, y): basis functions
a_i: coefficients
S: a non-linear function, S(x) = log(1 + x²) or |x|
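For a fixed set of basis functions, finding the coefficients a_i is a small optimization problem. The sketch below assumes the sparseness function S(x) = |x| with unit scale and minimizes ½·Σ[I − Σ a_i φ_i]² + λ·Σ|a_i| by iterative shrinkage-thresholding (ISTA). ISTA is a swapped-in, generic solver rather than the method used in the talk, and the ½ factor, the step-size rule, and the toy basis are all assumptions.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal operator of |.|)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def ista(basis, image, lam, n_iter=500):
    """Infer sparse coefficients a for image ~ basis @ a.

    basis : list of rows, one row per pixel and one column per basis function
    image : list of pixel values
    lam   : weight of the L1 sparseness penalty
    """
    n_pixels, n_basis = len(basis), len(basis[0])
    # The squared Frobenius norm bounds the largest eigenvalue of basis^T basis,
    # so 1 / that value is a safe (conservative) step size.
    step = 1.0 / sum(v * v for row in basis for v in row)
    a = [0.0] * n_basis
    for _ in range(n_iter):
        residual = [
            sum(basis[p][i] * a[i] for i in range(n_basis)) - image[p]
            for p in range(n_pixels)
        ]
        gradient = [
            sum(basis[p][i] * residual[p] for p in range(n_pixels))
            for i in range(n_basis)
        ]
        a = [soft_threshold(a[i] - step * gradient[i], step * lam) for i in range(n_basis)]
    return a

# Toy case: three pixels, three one-pixel basis functions.
identity_basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coefficients = ista(identity_basis, [3.0, 0.05, -2.0], lam=0.1)
print(coefficients)
```

With this trivial basis the solution is just the soft-thresholded image: the small middle pixel is driven exactly to zero, which is the sparsity effect of the penalty term.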
![Page 60: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/60.jpg)
Object Recognition with sparse, localized features
MIT-CSAIL-TR-2006-028 T. Serre
![Page 61: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/61.jpg)
![Page 62: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/62.jpg)
HMAX-sum + max
![Page 63: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/63.jpg)
Interdisciplinary Research
Center for Cognitive and Neural Computation, Tsinghua University, Beijing
• Computational Neuroscience
• System Neuroscience
• Intelligent Technology and Systems
• Neural Information and Brain-computer interface
• Learning and Memory
• Cognitive Psychology
![Page 64: 新的挑战 - 基于内容的信息处理](https://reader033.vdocuments.site/reader033/viewer/2022061401/56815a22550346895dc76748/html5/thumbnails/64.jpg)
Thank you!