TRANSCRIPT
Submodular Maximization with Cardinality Constraints
Moran Feldman
Based on: Submodular Maximization with Cardinality Constraints. Niv Buchbinder, Moran Feldman, Joseph (Seffi) Naor and Roy Schwartz, SODA 2014 (to appear).
Set Functions
Definition: Given a ground set N, a set function f : 2^N → R assigns a number to every subset of the ground set.
Intuition
• Consider a player participating in an auction on a set N of elements.
• The utility of the player from buying a subset N′ ⊆ N of elements is given by a set function f.
Basic Properties of Set Functions
• Non-negativity – the utility from every subset of elements is non-negative: ∀ A ⊆ N : f(A) ≥ 0.
• Monotonicity – more elements cannot give less utility: ∀ A ⊆ B ⊆ N : f(A) ≤ f(B).
Submodularity – Definition
Intuition
• Captures scenarios where elements can replace each other, but never complement each other.
• The marginal contribution of an element to a set decreases as more elements are added to the set.
Notation: Given a set A and an element u, fu(A) is the marginal contribution of u to A: fu(A) = f(A ∪ {u}) − f(A).
Formal Definition (two equivalent formulations)
• For sets A ⊆ B ⊆ N and u ∉ B: fu(A) ≥ fu(B).
• For sets A, B ⊆ N: f(A) + f(B) ≥ f(A ∪ B) + f(A ∩ B).
Submodular Function – Example
[Figure: a worked example assigning values to sets, including a "too heavy" set with a negative marginal contribution; the exact numbers are not recoverable from this transcript.]
• Non-negative
• Non-monotone
• Submodular
Where Can One Find Submodular Set Functions?
In Combinatorial Settings
Ground Set | Submodular Function
Nodes of a graph | The number of edges leaving a set of nodes.
Collection of sets | The number of elements in the union of a sub-collection.
In Applicative Settings
• Utility/cost functions in economics (economies of scale).
• Influence of a set of users in a social network.
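The two combinatorial examples above can be written as small value oracles; a minimal sketch (function and variable names are illustrative):

```python
# Two classical submodular functions, written as simple value oracles.
def cut_value(edges, S):
    """Number of edges of an undirected graph leaving the node set S."""
    return sum(1 for u, v in edges if (u in S) != (v in S))

def coverage_value(collection, picked):
    """Number of elements in the union of the picked sub-collection."""
    covered = set()
    for i in picked:
        covered |= collection[i]
    return len(covered)
```

For instance, on a triangle graph `cut_value([(1, 2), (2, 3), (1, 3)], {1})` is 2; both functions exhibit the decreasing-marginals property from the previous slide.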
Maximization Subject to a Cardinality Constraint
Instance: A non-negative submodular function f : 2^N → R+ and an integer k.
Objective: Find a subset S ⊆ N of size at most k maximizing f(S).
Accessing the Function
• A representation of f can be exponential in the size of the ground set.
• The algorithm has access to f via an oracle.
• Value Oracle – given a set S, returns f(S).
Algorithmic Evaluation Criteria
• Approximation ratio.
• Oracle queries.
• Time complexity – ignored in this talk.
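Since oracle queries are a central complexity measure in this talk, it can help to see how a value oracle is wrapped to count them; a minimal sketch (names are illustrative):

```python
# Wrap a value oracle so that every query is counted.
def counting_oracle(f):
    calls = {'count': 0}
    def oracle(S):
        calls['count'] += 1
        return f(S)
    return oracle, calls
```

An algorithm is then run against `oracle`, and `calls['count']` reports how many queries it issued.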
The (Classical) Greedy Algorithm
The Algorithm: Do k iterations. In each iteration pick the element with the maximum marginal contribution.
More Formally
1. Let S0 ← ∅.
2. For i = 1 to k do:
3.   Let ui be the element maximizing fui(Si-1).
4.   Let Si ← Si-1 ∪ {ui}.
5. Return Sk.
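The steps above can be sketched in a few lines (a minimal illustration, assuming `f` is a value oracle taking a Python set):

```python
# Classical greedy: k iterations, each adding the element with the
# maximum marginal contribution to the current solution.
def greedy(f, ground_set, k):
    S = set()
    for _ in range(k):
        # u_i maximizes f_u(S_{i-1}) = f(S + u) - f(S).
        u = max(ground_set - S, key=lambda u: f(S | {u}) - f(S))
        S.add(u)
    return S
```

Each iteration issues O(n) oracle queries, which gives the O(nk) total mentioned on the next slide.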
Results for Monotone Functions
Greedy Achieves
• (1 − 1/e)-approximation [Nemhauser et al. 78].
• Matches a hardness of 1 − 1/e [Nemhauser et al. 78].
• O(nk) oracle queries.
• For other constraints:
  • ½-approximation for a general matroid constraint [Nemhauser et al. 78].
  • (k+1)^−1-approximation for k-set systems [Nemhauser et al. 78] (presented formally by [Calinescu et al. 11]).
Reducing the Number of Oracle Calls
• O(nk) oracle queries is very good compared to tight algorithms for more involved constraints.
• A new result gives a (1 − 1/e − ε)-approximation using O(n ε^−1 log(n/ε)) oracle queries [Badanidiyuru and Vondrák 14].
• The number of oracle queries can be further reduced to O(n log(ε^−1)).
What About Non-monotone Functions?
Approximation Ratio
• 0.325-approximation via simulated annealing [Oveis Gharan and Vondrák 11].
• (1/e − o(1))-approximation (measured continuous greedy) [Feldman et al. 11].
• 0.491 hardness [Oveis Gharan and Vondrák 11].
Oracle Queries
• Both algorithms require many oracle queries.
• The greedy algorithm requires few oracle queries, but guarantees no constant approximation ratio.
– Example: N = {v, u1, u2, …, un} and, for every set S, f(S) = 2 if v ∈ S and f(S) = |S| otherwise.
– The greedy algorithm will select v in the first iteration; afterwards no element has a positive marginal contribution, so greedy ends with value 2, while the best set of size k has value k.
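The slide's example is garbled in this transcript; the instance below is a reconstruction with the claimed behavior (non-negative, submodular, and fatal for greedy), so treat the exact constants as illustrative:

```python
# Greedy on a bad instance: f(S) = 2 if v is in S, and |S| otherwise.
def f(S):
    return 2 if 'v' in S else len(S)

def greedy(ground_set, k):
    S = set()
    for _ in range(k):
        S.add(max(ground_set - S, key=lambda u: f(S | {u}) - f(S)))
    return S

n, k = 10, 5
N = {'v'} | {f'u{i}' for i in range(n)}
S = greedy(N, k)
# Greedy takes v first (marginal 2 beats 1); afterwards every marginal is 0,
# so the output has value 2, while {u1, ..., uk} has value k = 5.
```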
The Random Greedy Algorithm
The Algorithm: Do k iterations. In each iteration pick, uniformly at random, one element out of the k with the largest marginal contributions.
More Formally
1. Let S0 ← ∅.
2. For i = 1 to k do:
3.   Let Mi be the set of the k elements with the largest marginal contributions fu(Si-1).
4.   Let ui be a uniformly random element from Mi.
5.   Let Si ← Si-1 ∪ {ui}.
6. Return Sk.
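A minimal sketch of the above (illustrative; `f` is a value oracle on sets, and ties in marginal values are broken arbitrarily):

```python
import random

# Random greedy: in each iteration, compute the k elements with the
# largest marginal contributions and add one of them uniformly at random.
def random_greedy(f, ground_set, k):
    S = set()
    for _ in range(k):
        M = sorted(ground_set - S,
                   key=lambda u: f(S | {u}) - f(S),
                   reverse=True)[:k]
        S.add(random.choice(M))
    return S
```

For non-monotone f, the dummy-element reduction described on a later slide guarantees that the k best marginals are never negative, so adding an element in every iteration is safe.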
Warm Up: Analysis for Monotone Functions
In iteration i
• Fix everything that happened before iteration i. All the expectations below are conditioned on this history.
• By submodularity and monotonicity:
  Σu∈OPT fu(Si-1) ≥ f(OPT ∪ Si-1) − f(Si-1) ≥ f(OPT) − f(Si-1).
• The element ui is picked uniformly at random from Mi, and OPT is a potential candidate to be Mi, hence:
  E[f(Si)] − f(Si-1) = E[fui(Si-1)] = (1/k) Σu∈Mi fu(Si-1) ≥ (1/k) Σu∈OPT fu(Si-1) ≥ [f(OPT) − f(Si-1)] / k.
• Unfix the history – since the inequality holds for every given history, it also holds in general:
  E[f(Si)] − E[f(Si-1)] ≥ [f(OPT) − E[f(Si-1)]] / k.
Warm Up: Analysis for Monotone Functions (cont.)
Adding up all iterations
Rearranging the inequality from the previous slide:
  f(OPT) − E[f(Si)] ≤ (1 − 1/k) · [f(OPT) − E[f(Si-1)]].
Combining over all k iterations:
  f(OPT) − E[f(Sk)] ≤ (1 − 1/k)^k · [f(OPT) − f(S0)] ≤ e^−1 · f(OPT),
where the last step uses f(S0) = f(∅) ≥ 0 and (1 − 1/k)^k ≤ e^−1.
Rearranging again:
  E[f(Sk)] ≥ (1 − 1/e) · f(OPT).
We get a set with a value of (1 − 1/e) · f(OPT) in expectation (unlike the classical greedy, the guarantee here holds only in expectation).
Reduction for Non-monotone Functions
• We add k dummy elements of value 0.
• The dummy elements are removed at the end.
• Allows us to assume OPT is of size exactly k.
Analysis for Non-monotone Functions
Helper Lemma: For a submodular function g : 2^N → R+ and a random set R containing every element with probability at most p: E[g(R)] ≥ (1 − p) · g(∅).
• Similar to a lemma from [Feige et al. 2007].
• Will be proved later.
Current Objective
• Lower bound E[f(OPT ∪ Si)].
• Method – show that no element belongs to Si with a large probability, and then apply the above lemma.
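A quick Monte-Carlo sanity check of the helper lemma (not a proof; the instance is illustrative): take f(S) = 2 if v ∈ S and |S| otherwise (non-negative and submodular), let g(S) = f(OPT ∪ S), and include each element in R independently with probability p = 1/2.

```python
import random

def f(S):
    return 2 if 'v' in S else len(S)

N = {'v'} | {f'u{i}' for i in range(10)}
OPT = {f'u{i}' for i in range(5)}

def g(S):
    return f(OPT | S)   # non-negative and submodular, g(empty) = f(OPT) = 5

p, trials = 0.5, 20000
avg = sum(g({u for u in N if random.random() < p})
          for _ in range(trials)) / trials
# The lemma guarantees E[g(R)] >= (1 - p) * g(empty) = 2.5;
# here the true expectation is actually higher, around 4.75.
```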
Analysis for Non-monotone Functions (cont.)
Observation: In every iteration i, every element outside of Si-1 has probability at most 1/k to enter Si.
Corollary: An element belongs to Si with probability at most 1 − (1 − 1/k)^i.
Applying the Helper Lemma
• Let g(S) = f(OPT ∪ S).
• Observe that g(S) is non-negative and submodular.
• E[f(OPT ∪ Si)] = E[g(Si)] ≥ (1 − 1/k)^i · g(∅) = (1 − 1/k)^i · f(OPT).
Next Step: Repeat the analysis of the classical greedy algorithm, using the above bound instead of monotonicity.
Analysis for Non-monotone Functions (cont.)
In iteration i
• Fix everything that happened before iteration i. All the expectations below are conditioned on this history.
• By submodularity:
  Σu∈OPT fu(Si-1) ≥ f(OPT ∪ Si-1) − f(Si-1).
• The element ui is picked uniformly at random from Mi, and OPT is a potential candidate to be Mi (recall that the dummy elements let us assume |OPT| = k), hence:
  E[f(Si)] − f(Si-1) = E[fui(Si-1)] = (1/k) Σu∈Mi fu(Si-1) ≥ (1/k) Σu∈OPT fu(Si-1) ≥ [f(OPT ∪ Si-1) − f(Si-1)] / k.
Analysis for Non-monotone Functions (cont.)
• Unfixing history, and using previous observations, we get:
  E[f(Si)] − E[f(Si-1)] ≥ [E[f(OPT ∪ Si-1)] − E[f(Si-1)]] / k ≥ [(1 − 1/k)^(i−1) · f(OPT) − E[f(Si-1)]] / k.
Adding up all iterations
• We got a lower bound on the (expected) improvement in each iteration.
• Using induction it is possible to prove that:
  E[f(Si)] ≥ (i/k) · (1 − 1/k)^(i−1) · f(OPT).
• In particular, for i = k:
  E[f(Sk)] ≥ (1 − 1/k)^(k−1) · f(OPT) ≥ e^−1 · f(OPT).
Remarks
• This algorithm both uses fewer oracle queries than the previous ones and gets rid of the o(1) in the approximation ratio.
• Now it all boils down to proving the helper lemma.
Proof of the Helper Lemma
Helper Lemma – Reminder: Given a submodular function g : 2^N → R+ and a random set R containing every element with probability at most p. Then, E[g(R)] ≥ (1 − p) · g(∅).
Intuition
• Adding all the elements can reduce the value g(∅) by at most g(∅) (down to 0), since g is non-negative.
• Adding at most a p-fraction of every element should therefore reduce g(∅) by no more than p · g(∅).
Notation
• Order the elements of N as u1, u2, …, un in non-increasing probability of belonging to R.
• Let Ni be the set of the first i elements in the above order (N0 = ∅).
• Let pi = Pr[ui ∈ R].
• Let Xi be an indicator for the event ui ∈ R. Notice that E[Xi] = pi.
Proof of the Helper Lemma (cont.)
The value of the set R can be represented using the following telescopic sum:
  g(R) = g(∅) + Σi=1..n [g(Ni ∩ R) − g(Ni-1 ∩ R)].
Taking an expectation over both sides, we get:
  E[g(R)] = g(∅) + Σi=1..n E[g(Ni ∩ R) − g(Ni-1 ∩ R)]
          = g(∅) + Σi=1..n E[Xi · gui(Ni-1 ∩ R)]
          ≥ g(∅) + Σi=1..n E[Xi] · gui(Ni-1)   (by submodularity, since Ni-1 ∩ R ⊆ Ni-1)
          = g(∅) + Σi=1..n pi · [g(Ni) − g(Ni-1)]
          = (1 − p1) · g(∅) + Σi=1..n−1 (pi − pi+1) · g(Ni) + pn · g(Nn)
          ≥ (1 − p) · g(∅),
where the last inequality uses p1 ≤ p, the non-increasing order of the pi’s, and the non-negativity of g.
Playing with the Size of Mi
Question
• The size of Mi determines the guarantee we have on E[f(Si ∪ OPT)].
• The larger Mi – the better the guarantee.
• Why not increase |Mi| to be larger than k?
Answer
• We know there are k good elements (on average) – the elements of OPT.
• Increasing Mi might introduce useless elements into it.
• The gain in every single iteration might decrease.
And yet…
The Bad Case
• Let Mi^k be the set of the k elements with the best marginal values at iteration i.
• There are no useful elements outside of Mi^k.
• Most of OPT’s value is contributed by OPT ∩ Mi^k.
• The best subset of Mi^k is:
– Feasible.
– Has a lot of value.
– Can be (approximately) found using an algorithm for unconstrained submodular maximization.
Taking Advantage
• Apply the fast algorithm with Mi larger than k.
• At every iteration, find the best subset of Mi^k.
• Output the best set seen.
And yet… (cont.)
• By making the size of Mi a function of i, one can get a (e^−1 + ε)-approximation for some small constant ε > 0.
• Using a few more tricks, one can improve ε to 0.004.
Implications
• Very small improvement in the approximation ratio, at the cost of many more oracle queries.
• The ratio e^−1 is not the right ratio for a cardinality constraint.
– No candidate for the right ratio.
– e^−1 is the state of the art for a general matroid constraint. Is it right for that case?
Equality Cardinality Constraint
New Objective: Find a subset S ⊆ N of size exactly k maximizing f(S).
Monotone Functions: Not interesting – we can always add arbitrary elements to the output.
Non-monotone Functions
• Best previous approximation: ¼ − o(1).
• Modifications to our algorithm:
– Apply a reduction that lets us assume k ≤ n/2.
– Avoid the reduction described previously (dummy elements).
– Select only elements of N \ Si-1 into Mi.
• Achieves:
– Approximation of (1 − ok(1)) · e^(−v/2) · erfi(√(v/2)) / √(2v), where v = n/k − 1.
– Uses O(nk) oracle queries.
– The term ok(1) can be replaced with ε at the cost of a multiplicative constant increase in the number of oracle queries.
Understanding the Approximation Ratio
• The interesting range is 1 ≤ v (i.e., k ≤ n/2).
• erfi is the imaginary error function: erfi(z) = (2/√π) · ∫0^z e^(y²) dy.
• The approximation ratio as a function of v: [graph omitted from this transcript].
Reduction
Aim: We want to assume k ≤ n/2.
Observations
• Equivalent problem – find a subset of size exactly n − k maximizing h(S) = f(N \ S).
• h(S) is non-negative and submodular if and only if f has these properties:
  h(A) + h(B) = f(N \ A) + f(N \ B) ≥ f((N \ A) ∪ (N \ B)) + f((N \ A) ∩ (N \ B)) = f(N \ (A ∩ B)) + f(N \ (A ∪ B)) = h(A ∩ B) + h(A ∪ B).
Corollary: If k > n/2, we can switch to the above equivalent problem.
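In code the reduction is a one-line wrapper around the value oracle; a minimal sketch (names are illustrative):

```python
# Complement reduction: maximizing f over sets of size exactly k is the
# same as maximizing h(S) = f(N \ S) over sets of size exactly n - k.
def complement_oracle(f, N):
    def h(S):
        return f(N - S)
    return h
```

For example, wrapping the cardinality function f = |S| gives h(S) = |N| − |S|.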
Analysis Intuition
• A possible candidate for Mi is OPT \ Si padded with random elements of N \ (OPT ∪ Si).
• The padding elements can reduce the value of the solution.
• However:
– The expected number of padding elements in iteration i is only k · (1 − (1 − 1/k)^i) (and k is small compared to n because of the reduction).
– Adding all the elements of N \ (OPT ∪ Si) reduces the value to 0 (in the worst case).
– Thus, an average element of N \ (OPT ∪ Si) reduces the value by a factor of at most 1 / |N \ (OPT ∪ Si)|.
[Figure: a candidate set of size k composed of OPT \ Si plus padding.]
Other Results
Cardinality Constraint
• For both problems we consider, an approximation ratio of k / (2(n − k)).
– For k = n/2, both problems have an approximation ratio of ½.
– For an equality constraint: 0.356-approximation by balancing this ratio with the one presented before.
Fast Algorithms for General Matroid Constraint
• State of the art approximation ratio for a general matroid constraint: e^−1 − o(1).

Approximation Ratio        | Oracle Queries | Time Complexity
1/4                        | O(nk)          | O(nk)
(1 + e^−2)/4 − ε > 0.283   | O(nk + k³)     | O(nk + k^(ω+1))
Open Problems
• Cardinality Constraint
– The approximability depends on k/n.
  • For k/n = 0.5, we have a 0.5-approximation.
  • For small k’s, one cannot beat 0.491 [Oveis Gharan and Vondrák 11].
– What is the correct approximation ratio for a given k/n?
• Fast Algorithms
– Finding fast algorithms for more involved constraints.
  • Such as a general matroid constraint.
– Beating e^−1 using a fast algorithm:
  • Even for large k/n values.
– Further reducing the number of oracle queries necessary to get a (1 − 1/e − ε)-approximation.
  • No lower bounds on the number of necessary oracle queries are known.