Look Over Here: Attention-Directing Composition of Manga Elements
Ying Cao, Rynson W.H. Lau, Antoni B. Chan
SIGGRAPH 2014
Outline
• Introduction
• Overview
• Data Acquisition and Preprocessing
• Probabilistic Graphical Model
• Learning
• Interactive Composition Synthesis
• Evaluation and Results
• Discussion
Introduction
• Goal
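[Figure: example input and result. The input is a numbered script ("1. Rabbit, I came here for gold, 2. and I'm gonna get it! 3. I gotcha, you rabbit! I'll show you!") with a shot type and motion state per panel (Close-up/Fast, Long/Medium, Close-up/Medium, Big Close-up/Medium, Medium/Medium) and a "Talk" constraint. The result is a page with panels numbered 1 to 5, including balloons such as "You can't do this to me!" and "Eureka! Gold at last!".]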
Introduction
• The composition of manga elements, especially subjects and balloons.
• Manga artist guides viewer’s eyes through the page via subject and balloon placement.
• The path guiding the readers through the artworks: the underlying artist’s guiding path (AGP)
• The viewer’s eye-gaze path through the page: the actual viewer attention
Introduction
• We introduce a novel probabilistic graphical model for subject-balloon composition.
• Based on this model, we propose an approach for placing a set of subjects and their balloons on a page in response to high-level user specification.
• We evaluate its effectiveness through a series of visual perception studies.
Overview
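[Figure: system overview. A probabilistic graphical model over the artist’s guiding path (G), composition (C) and viewer attention (A) is learned from annotation and eye-tracking data. Given an input storyboard, a layout is generated and the model is used to infer the resulting composition.]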
Data Acquisition and Preprocessing
• To train our probabilistic model, we have collected a data set comprising 80 manga pages from three different series.
[Figure: an annotated manga page (shot type, motion state, subjects and balloons) next to the recorded eye movements of viewers.]
Probabilistic Graphical Model
• We propose a novel probabilistic graphical model to hierarchically connect the artist’s guiding path, composition and viewer attention in a probabilistic network.
• The model abstracts the artist’s guiding path (AGP) as a latent variable.
Probabilistic Graphical Model
[Figure: the model connects the artist’s guiding path (G), composition (C) and viewer attention (A).]
Probabilistic Graphical Model
• Our proposed model consists of six components, representing different factors that influence the placement of elements on the page.
(1)-Model Components and Variables
• In our model, the page consists of a set of panels.
• Each panel has subjects, each of which has balloons.
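As a minimal sketch of this hierarchy, the page, panel, subject and balloon levels could be held in nested records. The class and field names below (`Page`, `Panel`, `Subject`, `Balloon`, `reading_order`, `count_elements`) are illustrative assumptions, not names from the talk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Balloon:
    text: str
    reading_order: int  # hypothetical field: position in the reading sequence


@dataclass
class Subject:
    name: str
    balloons: List[Balloon] = field(default_factory=list)


@dataclass
class Panel:
    subjects: List[Subject] = field(default_factory=list)


@dataclass
class Page:
    panels: List[Panel] = field(default_factory=list)


def count_elements(panel: Panel) -> int:
    """Count the subjects and balloons contained in one panel."""
    return sum(1 + len(subject.balloons) for subject in panel.subjects)
```

Counting subjects and balloons together as "elements" is also an assumption made here for illustration.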
(1)-Model Components and Variables
• Artist’s Guiding Path (AGP)
The underlying AGP (f(t)) and the actual AGP (I(t)) are represented as smooth splines over the page.
Control points are sampled uniformly along the curve length.
[Figure: the actual AGP and the underlying AGP.]
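Uniform sampling along the curve length can be sketched as follows. This is a simplified illustration: a polyline stands in for the smooth spline, and the function name `resample_uniform` is hypothetical.

```python
import math


def resample_uniform(points, n):
    """Return n points evenly spaced by arc length along a polyline.

    `points` is a list of (x, y) vertices; the polyline is an assumed
    stand-in for the smooth spline used to represent the path.
    """
    if n < 2 or len(points) < 2:
        raise ValueError("need at least two input points and n >= 2")
    # Cumulative arc length at each input vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    out = []
    seg = 0
    for k in range(n):
        target = total * k / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        a = 0.0 if span == 0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + a * (x1 - x0), y0 + a * (y1 - y0)))
    return out
```

For an L-shaped path of total length 8, `resample_uniform([(0, 0), (4, 0), (4, 4)], 5)` places one control point every 2 units of arc length.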
(1)-Model Components and Variables
• Panel Properties and Local Composition Model
We consider both semantic (i.e., shot type and motion state) and geometric (i.e., rough shape) properties of the panels.
- Geometric style g ∈ {geometric style 1 = 1, geometric style 2 = 2, geometric style 3 = 3}
- Shot type t ∈ {long = 1, medium = 2, close-up = 3, big close-up = 4}
- Motion state m ∈ {slow = 1, medium = 2, fast = 3}
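The integer codes above map naturally onto enumerations. The sketch below uses the values listed on the slide; the class names and the `parse_shot_type` helper are illustrative assumptions.

```python
from enum import IntEnum


class ShotType(IntEnum):
    LONG = 1
    MEDIUM = 2
    CLOSE_UP = 3
    BIG_CLOSE_UP = 4


class MotionState(IntEnum):
    SLOW = 1
    MEDIUM = 2
    FAST = 3


class GeometricStyle(IntEnum):
    STYLE_1 = 1
    STYLE_2 = 2
    STYLE_3 = 3


def parse_shot_type(label: str) -> ShotType:
    """Hypothetical helper: turn a label such as 'Big Close-up' into a code."""
    key = label.strip().upper().replace("-", "_").replace(" ", "_")
    return ShotType[key]
```

Using `IntEnum` keeps the codes usable as plain integers, for example as indices into a categorical distribution.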
(1)-Model Components and Variables
• Panel Properties and Local Composition Model
The local composition variables describe the possible subject locations and sizes according to the local composition in the panel.
(1)-Model Components and Variables
• Subject Placement
The actual placement of a subject is a mixture of its local position and an associated point on the global AGP.
The subject’s location and size are captured by the subject variable S.
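One simple way to picture such a mixture is a convex combination of the two positions. This is only a sketch under that assumption: the talk does not give the exact form here, and the names `blend_placement` and `weight` are hypothetical.

```python
def blend_placement(local_xy, agp_xy, weight):
    """Blend a local position with a point on the AGP.

    `weight` in [0, 1] is an assumed mixing weight: 1.0 keeps the local
    position, 0.0 snaps the subject onto the AGP point.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return tuple(
        weight * local + (1.0 - weight) * agp
        for local, agp in zip(local_xy, agp_xy)
    )
```

For example, `blend_placement((0.0, 0.0), (10.0, 20.0), 0.25)` moves the subject three quarters of the way towards the AGP point.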
(1)-Model Components and Variables
• Balloon Placement
The placement of a balloon depends on its subject’s configuration, its size, and its reading order, as well as an associated point on the AGP.
The balloon’s position and size are captured by the balloon variable B.
(1)-Model Components and Variables
• Viewer Attention Transitions
For each panel, we define a set of binary variables U, where each variable indicates whether there is a viewer transition between a pair of elements.
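As an illustration, such binary indicators could be derived from the order in which a viewer looks at the elements of a panel. The function name and the choice of directed transitions are assumptions made for this sketch.

```python
def transition_indicators(visit_sequence, num_elements):
    """Build a binary matrix u where u[i][j] = 1 if the viewer moved
    directly from element i to element j at least once.

    `visit_sequence` lists element indices in viewing order. Treating
    transitions as directed is an assumption of this sketch.
    """
    u = [[0] * num_elements for _ in range(num_elements)]
    for a, b in zip(visit_sequence, visit_sequence[1:]):
        if a != b:  # staying on the same element is not a transition
            u[a][b] = 1
    return u
```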
(1)-Model Components and Variables
• The complete model is obtained by putting the six model components together.
(2)- Probability Distributions
• Each random variable in our model is associated with a conditional probability distribution (CPD), which represents the probability of observing the variable given its parents.
• We next describe the CPDs used for each variable in our model.
(2)- Probability Distributions
• Artist’s Guiding Path (f, I).
The two coordinate components of the curve are modeled as two independent Gaussian processes.
- Both processes use squared exponential covariance functions.
The actual AGP I is a noisy version of the underlying AGP f.
- N(x | µ, Σ) denotes a multivariate Gaussian distribution of x, with mean µ and covariance Σ.
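The sketch below draws a path from two independent Gaussian processes with a squared exponential covariance and then perturbs it with Gaussian noise. The zero mean, the hyperparameter values (`variance`, `length_scale`, `noise_std`, `jitter`) and all function names are assumptions for illustration, not values from the talk.

```python
import math
import random


def se_kernel(t1, t2, variance=1.0, length_scale=0.2):
    """Squared exponential covariance between curve parameters t1 and t2."""
    return variance * math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_gp(ts, rng, variance=1.0, length_scale=0.2, jitter=1e-6):
    """Draw one zero-mean GP sample at the parameter values ts."""
    n = len(ts)
    cov = [[se_kernel(ts[i], ts[j], variance, length_scale)
            + (jitter if i == j else 0.0) for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]


def sample_agp(ts, rng, noise_std=0.01):
    """Sample an underlying path (x and y drawn independently) and a noisy copy."""
    fx = sample_gp(ts, rng)
    fy = sample_gp(ts, rng)
    underlying = list(zip(fx, fy))
    actual = [(x + rng.gauss(0.0, noise_std), y + rng.gauss(0.0, noise_std))
              for x, y in underlying]
    return underlying, actual
```

A call such as `sample_agp([i / 5 for i in range(6)], random.Random(0))` returns six control points for each path. The small `jitter` on the diagonal is a common numerical safeguard for the Cholesky factorization.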
(2)- Probability Distributions
• Panel Properties (P).
The shot type t, motion state m and geometric style g are all discrete random variables with categorical distributions.
• Local Composition.
To describe the complexities of local foreground placement, we use a Gaussian mixture model (GMM).
The local subject size is Gaussian.
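A Gaussian mixture density over a 2D position can be evaluated as below. This is a generic sketch: the diagonal covariances, the parameter layout and the function names are assumptions, and no learned parameters from the talk are reproduced.

```python
import math


def gaussian_pdf_diag(x, mean, var):
    """Density of a Gaussian with diagonal covariance (per-axis variances)."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * math.log(2.0 * math.pi * vi) - 0.5 * (xi - mi) ** 2 / vi
    return math.exp(log_p)


def gmm_pdf(x, weights, means, variances):
    """Mixture density: weighted sum of the component densities."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(
        w * gaussian_pdf_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    )
```

With several components, the density can have several peaks, which is what lets a mixture describe panels where more than one foreground position is plausible.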
(2)- Probability Distributions
• Subjects and Balloons (S, B).
The CPD of the subject S is conditioned on its continuous parent variables.
Similarly, the CPD of the balloon B is conditioned on its continuous parent variables.
- The CPD of the subject size is defined with a weight parameter ω and a variance.
(2)- Probability Distributions
• Viewer Attention Transitions (U).
The CPD of each transition variable in U is defined given its set of parent random variables, using a potential function.
The potential function is a linear combination of two terms.
Learning
• The goal of the offline learning stage is to estimate the parameters θ in the CPDs of all random variables in the probabilistic model, from the training set D.
• The parameters are estimated with the expectation-maximization (EM) algorithm [Bishop 2006].
BISHOP, C. 2006. Pattern Recognition and Machine Learning. Springer.
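To show the alternating structure of EM, here is a toy example that fits a one-dimensional Gaussian mixture. It is not the model from the talk, whose E-step and M-step operate on its own variables; the function names and the `min_var` floor are assumptions of this sketch.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, iterations=50, min_var=1e-6):
    """Fit a 1D Gaussian mixture with EM, starting from the given parameters."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            p = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate parameters from the weighted data.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(min_var, var)  # floor avoids a collapsed component
            weights[j] = nj / len(data)
    return means, variances, weights
```

On data with two well-separated clusters, the estimated means settle near the cluster centers within a few iterations.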
Interactive Composition Synthesis
• Generate a composition, subject to user-specified semantics
• Layout Generation + Composition Synthesis
[Figure: example input. Subject and script ("1. Rabbit, I came here for gold, 2. and I'm gonna get it! 3. I gotcha, you rabbit! I'll show you!"), shot type and motion state (Close-up, Fast), and inter-subject constraint (Talk).]
(1)-Layout Generation
• We use a simple search algorithm to retrieve the best-fitting layout from our database of labeled pages.
The i-th panel of the input is compared with the i-th panel of each layout candidate on:
- shot type
- motion state
- the number of elements
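A search of this kind can be sketched as a linear scan over the database. The scoring function below (a sum of absolute differences over the three properties) is an assumption, as are the dictionary keys and function names; the talk's actual matching score is not reproduced here.

```python
def panel_distance(input_panel, candidate_panel):
    """Assumed mismatch score between two panels (lower is better)."""
    return (
        abs(input_panel["shot_type"] - candidate_panel["shot_type"])
        + abs(input_panel["motion_state"] - candidate_panel["motion_state"])
        + abs(input_panel["num_elements"] - candidate_panel["num_elements"])
    )


def best_layout(input_panels, layouts):
    """Return the layout whose panels best match the input, or None.

    Only layouts with the same number of panels as the input are
    considered, which is a simplification made for this sketch.
    """
    best, best_cost = None, float("inf")
    for layout in layouts:
        panels = layout["panels"]
        if len(panels) != len(input_panels):
            continue
        cost = sum(panel_distance(a, b) for a, b in zip(input_panels, panels))
        if cost < best_cost:
            best, best_cost = layout, cost
    return best
```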
(2)-Composition via MAP Inference

• The objective of maximum a posteriori (MAP) inference is to find the configurations of elements that maximize the posterior probability, given the input elements and semantics, the layout, and the constraints.
(2)-Composition via MAP Inference
Constraint-based Likelihood.
- {ρi} are weights controlling the importance of the different terms.
- Our implementation uses ρ1 = ρ2 = 0.3, ρ3 = ρ4 = 0.2.
(2)-Composition via MAP Inference
Constraint-based Likelihood.
- boundary term
- subject relation term
[Figure: illustration of the terms, with labels r_i, r_j and v_ij.]
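The weighting of constraint terms and the MAP selection can be sketched as follows. Two assumptions are made loudly here: the likelihood is taken to be the exponential of a negative weighted sum of term costs, and a brute-force scan over a candidate list stands in for the actual inference procedure. Function names are hypothetical; only the ρ values come from the slide.

```python
RHO = (0.3, 0.3, 0.2, 0.2)  # weights rho_1..rho_4 as stated on the slide


def constraint_log_likelihood(term_costs, rho=RHO):
    """Assumed form: log-likelihood = -(sum of weighted term costs)."""
    if len(term_costs) != len(rho):
        raise ValueError("one cost per weighted term is required")
    return -sum(r * c for r, c in zip(rho, term_costs))


def map_estimate(candidates, log_prior, term_costs_fn):
    """Pick the candidate with the highest log-posterior (up to a constant).

    `log_prior` and `term_costs_fn` are caller-supplied functions that
    score a candidate configuration; both are placeholders for the model.
    """
    return max(
        candidates,
        key=lambda c: log_prior(c) + constraint_log_likelihood(term_costs_fn(c)),
    )
```

Working in log space turns the product of prior and likelihood into a sum, so the four weights act as simple trade-off coefficients between the terms.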
(1)-Comparison to Heuristic Method
• Visual Perception Study.
The goal of the visual perception study is to investigate whether the participants have a strong preference for our results over those produced by the heuristic method [Chun et al. 2006].
CHUN, B., RYU, D., HWANG, W., AND CHO, H. 2006. An automated procedure for word balloon placement in cinema comics. LNCS 4292, 576–585.
(1)-Comparison to Heuristic Method
• Eye-tracking experiment and analysis.
We measure the consistency in both unordered and ordered eye fixations across different viewers.
- Inlier percent [Judd et al. 2009]
- Root Mean Squared Distance (RMSD)
JUDD, T., EHINGER, K., DURAND, F., AND TORRALBA, A. 2009. Learning to predict where humans look. In ICCV’09.
[Figure: inlier percent, where viewer A’s saliency map is used to classify viewer B’s fixations as inliers, and RMSD between the fixations of viewer A and viewer B.]
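Both measures can be sketched in a few lines. The details below are assumptions made for illustration: fixation paths are taken to be already aligned and of equal length for RMSD, and an inlier is a fixation landing on a saliency value at or above a chosen threshold. The exact protocol used in the study may differ.

```python
import math


def rmsd(path_a, path_b):
    """Root mean squared distance between two aligned fixation paths."""
    if len(path_a) != len(path_b) or not path_a:
        raise ValueError("paths must be non-empty and of equal length")
    squared = [
        (ax - bx) ** 2 + (ay - by) ** 2
        for (ax, ay), (bx, by) in zip(path_a, path_b)
    ]
    return math.sqrt(sum(squared) / len(squared))


def inlier_percent(saliency, fixations, threshold):
    """Percentage of fixations that fall on salient pixels.

    `saliency` is a 2D list indexed as saliency[row][col], built from one
    viewer; `fixations` are (col, row) pixel positions of another viewer.
    """
    if not fixations:
        raise ValueError("at least one fixation is required")
    inside = sum(1 for col, row in fixations if saliency[row][col] >= threshold)
    return 100.0 * inside / len(fixations)
```

A lower RMSD and a higher inlier percent both indicate that different viewers looked at a page in a more consistent way.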
(1)-Comparison to Heuristic Method
• Eye-tracking experiment and analysis.
[Figure: example compositions with eye-tracking data.]
(5)-Limitations
• Our work has two limitations.
1. Our work assumes that the variations in spatial location and scale of elements are the only factors driving viewer attention.
2. For panels with more than four subjects, our approach can fail to produce satisfying results automatically.
Discussion
• We have proposed a probabilistic graphical model for representing the dependency among the artist’s guiding path, composition and viewer attention.
• We show that compositions from our approach are more visually appealing and provide a smoother reading experience, as compared to those by a heuristic method.
• Our approach enables easy and quick creation of attention-directing compositions.
• It could be extended to other graphic design tasks.