From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Comparative Influence Maximization: From Competition to Complementarity Wei Lu (LinkedIn) Wei Chen (Microsoft Research) Laks V.S. Lakshmanan (UBC) NDA’16 Workshop, SIGMOD To appear in VLDB’16, New Delhi, India

Upload: wei-lu

Post on 18-Feb-2017


TRANSCRIPT

Page 1: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Comparative Influence Maximization: From Competition to Complementarity

Wei Lu (LinkedIn)
Wei Chen (Microsoft Research)
Laks V.S. Lakshmanan (UBC)

NDA’16 Workshop, SIGMOD
To appear in VLDB’16, New Delhi, India

Page 2: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Social influence
• Ubiquitous in life
• Fueled by the widespread popularity of online social networks and social media
• Computational Social Influence (CSI)
  – Viral Marketing
  – Influence Maximization
  – Applications and extensions of the above

Page 3: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Computational Social Influence
• Social networks with edge weights (influence probabilities or weights)
• Stochastic influence/information propagation models
  – Single-item vs. multiple-item models
• Diffusion dynamics depend heavily on the relationship of the propagating entities
• Pure Competition: Each user adopts at most one item
  – Competitive Independent Cascade Model (CIC)
  – K-LT Model
  – WPCLT Model …

Page 4: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Limitations of Pure Competition Models: Example

Page 5: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Item Relationships
• Propagating items can be in any relationship:
  – Compete (iPhone vs Nexus)
  – Complement (iPhone vs Apple Watch, iPhone vs iPhone cases)
• Natural and well-studied in economics
  – Substitute goods and complementary goods
• Item relationship may be asymmetric
• Item relationship may be to an arbitrary degree (not “pure”)

Page 6: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Motivations and Challenges
“One model that works for all kinds of item relationships”: did not exist before this work
Challenges:
• Unified model with great expressive power
• Compact and manageable representation
• Allows room to develop tractable solutions for natural influence optimization problems
• Model validation, data

Page 7: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Main Contributions
• Comparative Independent Cascade (ComIC): Capturing both competition and complementarity, to any arbitrary degree
• Problem: Self Influence Maximization
• Problem: Complementary Influence Maximization
• Algorithm: Generalized Reverse Reachable Sets
• Algorithm: Sandwich Approximation

Page 8: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Model Overview
• Focusing on two items
  – Challenges abundant already
  – Future work: extend to an arbitrary number of items
• Edge-level influence/information propagation
  – Similar to the classic IC model
• Node-level decision-making controlled by Node-Level Automata (NLA)
  – Global Adoption Probabilities (GAP)

Page 9: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Global Adoption Probabilities
• Key parameters measuring the degree to which two items compete with or complement each other
• q(A|0): probability of adopting A when the user has not yet adopted any other items
• q(A|B): probability of adopting A when the user has already adopted B
• q(A|0) >= q(A|B): B competes with A
• q(A|0) <= q(A|B): B complements A
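The two inequalities above can be read as a simple comparison of the two GAPs for A. The sketch below is only an illustration: the function name `relationship`, its parameter names, and the sample numbers are hypothetical and do not come from the talk. The equality case, which satisfies both inequalities, is reported separately here as indifference.

```python
def relationship(q_a_none: float, q_a_b: float) -> str:
    """Classify how item B affects item A from q(A|0) and q(A|B)."""
    for q in (q_a_none, q_a_b):
        if not 0.0 <= q <= 1.0:
            raise ValueError("adoption probabilities must lie in [0, 1]")
    if q_a_none > q_a_b:
        return "B competes with A"      # adopting B lowers the chance of adopting A
    if q_a_none < q_a_b:
        return "B complements A"        # adopting B raises the chance of adopting A
    return "A is indifferent to B"      # q(A|0) == q(A|B)


# Illustrative numbers only (assumed, not measured values).
print(relationship(0.5, 0.1))
print(relationship(0.2, 0.8))
```

Because the check is made per item, calling it once for A and once for B (with q(B|0) and q(B|A)) is enough to express asymmetric relationships.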

Page 10: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Transition diagram

For each item, each node is in one of the following states:
• Idle (inactive)
• Informed (influenced)
• Suspended / Adopted / Rejected

Page 11: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Diffusion dynamics
• Initially, every node is inactive/idle w.r.t. both items
• When a node adopts its first item, its outgoing edges are tested for information propagation to neighbors (“info channel”)
  – Each edge (u,v) becomes open w.p. p(u,v)
• If u is A-adopted, and the info channel on edge (u,v) is open, then v adopts A:
  – w.p. q(A|0) if v has not adopted B
  – w.p. q(A|B) if v has adopted B
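A single propagation attempt from an A-adopted node u to a neighbor v, as described in the bullets above, might be sketched as follows. This is a simplified illustration: the function `try_inform_and_adopt` and its return labels are hypothetical, and tie-breaking and reconsideration are deliberately left out.

```python
import random


def try_inform_and_adopt(p_uv, v_has_b, q_a_none, q_a_b, rng):
    """One attempt by A-adopted u to make out-neighbor v adopt A.

    Returns 'blocked' if the info channel (u, v) stays closed,
    'informed' if v hears about A but does not adopt it,
    'adopted' if v adopts A.
    """
    # Edge-level test: the info channel opens with probability p(u, v).
    if rng.random() >= p_uv:
        return "blocked"
    # Node-level test: the GAP used depends on whether v already adopted B.
    q = q_a_b if v_has_b else q_a_none
    return "adopted" if rng.random() < q else "informed"


rng = random.Random(7)  # fixed seed so repeated runs behave the same
outcome = try_inform_and_adopt(0.3, v_has_b=True, q_a_none=0.2, q_a_b=0.6, rng=rng)
print(outcome)
```

Separating the edge test from the node test mirrors the split between edge-level propagation and node-level decision-making in the model overview.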

Page 12: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Node tie-breaking
• What if multiple in-neighbors became active in the last time step t-1?
• Generate a random permutation of those in-neighbors, and follow that order to test activation
• If one such neighbor adopted both items at t-1, follow the same order for informing
• If a seed is targeted with both items, decide the order randomly (0.5 and 0.5 prob.)

Page 13: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Node Reconsideration
• Suppose B complements A: q(A|0) <= q(A|B)
• User v was informed of A, but did not adopt it, which happens with probability 1 – q(A|0)
• Once v adopts B, since B complements A, the user may want to revisit the decision with a reconsideration probability:
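The formula itself is not preserved in this transcript. One choice consistent with the bullets above, assuming the overall probability that v adopts A once it holds B should equal q(A|B), is derived below. The symbol \rho_A is a name introduced here for illustration.

```latex
\[
\underbrace{q(A|0)}_{\text{adopted at first}}
+ \underbrace{\bigl(1 - q(A|0)\bigr)\,\rho_A}_{\text{declined, then reconsidered}}
= q(A|B)
\quad\Longrightarrow\quad
\rho_A = \frac{q(A|B) - q(A|0)}{1 - q(A|0)},
\qquad q(A|0) < 1 .
\]
```

For example, with q(A|0) = 0.2 and q(A|B) = 0.6 this gives \rho_A = 0.4 / 0.8 = 0.5. Under the complementarity assumption q(A|0) <= q(A|B), the value always lies in [0, 1].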

Page 14: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

General Properties of ComIC model

• Neither submodularity nor monotonicity holds in an arbitrary instance of the model

• Influence maximization may be intractable

• Overall strategy:

– Identify a parameter subspace such that submodularity is satisfied

– Develop efficient approximation algorithm (Generalized RR-set) for submodular cases

– “Sandwich Approximation” for non-submodular cases

Page 15: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Submodularity: Complementary Case

Page 16: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Possible World Definition
• An equivalent representation of the model and the propagation dynamics
  – Propagation in a possible world is deterministic, easy to reason about
• Equivalent Possible World model for ComIC
  – For each edge (u,v), remove w.p. 1-p(u,v)
  – For each node v, generate thresholds α(v,A) and α(v,B) uniformly at random from [0,1] for testing against adoption probabilities
  – Adoption happens when α <= adoption prob.
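Sampling one possible world along the lines above might look like the following sketch. The data layout (a dict of edge probabilities, a dict of thresholds keyed by node and item) and the names `sample_possible_world` and `adopts` are assumptions made for illustration.

```python
import random


def sample_possible_world(edges, nodes, seed=None):
    """Sample one possible world.

    edges: dict mapping (u, v) -> p(u, v).
    Returns (live_edges, alpha): the set of edges that survive, and a dict
    mapping (v, item) -> threshold drawn uniformly from [0, 1).
    """
    rng = random.Random(seed)
    # Each edge is removed with probability 1 - p(u, v), i.e. kept w.p. p(u, v).
    live_edges = {e for e, p in edges.items() if rng.random() < p}
    # One threshold per node and item, fixed for the whole propagation.
    alpha = {(v, item): rng.random() for v in nodes for item in ("A", "B")}
    return live_edges, alpha


def adopts(alpha, v, item, adoption_prob):
    """Deterministic adoption test inside a fixed possible world."""
    return alpha[(v, item)] <= adoption_prob


live, alpha = sample_possible_world({(0, 1): 0.9, (1, 2): 0.4}, [0, 1, 2], seed=42)
print(sorted(live), adopts(alpha, 1, "A", 0.5))
```

Once the coins are flipped up front, the same threshold α(v,A) is compared against q(A|0) or q(A|B) depending on v's state, which is what makes propagation deterministic.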

Page 17: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Influence Maximization Problems

• Self Influence Maximization (SIM): Fix B-seed set, find the best A-seed set of size k to maximize A’s expected influence spread

• Complementary Influence Maximization (CIM): Fix A-seed set, find the best B-seed set of size k to maximize the boost B gives to A’s expected influence spread

• Both NP-hard under ComIC model

Page 18: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Algorithm Design for SIM and CIM

• Generalized Reverse-Reachable Set (RR-set): RR-set based algorithms are the state-of-the-art for classical influence maximization with single-item propagation models (IC and LT)

• Sandwich Approximation to achieve approximation guarantees in non-submodular cases

• Both techniques are generic: generalized RR-sets apply to any monotone and submodular propagation model, and Sandwich Approximation applies to any non-submodular maximization problem

Page 19: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Recap: Reverse-Reachable Set
• If u can reach v (in a deterministic directed graph), then u is in the RR-set rooted at v [Borgs et al., SODA’14]
• Random RR-set: root v is randomly chosen
• Two-phase Inf. Max. (TIM) [Tang et al 2014]
  – Estimate the minimum number of random RR-sets required, for probabilistic approx. guarantees
    • 1-1/e-ε: smaller ε requires more RR-sets to be generated
  – Generate random RR-sets using backward BFS
  – Seed selection (deterministic max-cover problem)
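The last two steps above can be sketched for the classic single-item IC model as follows. This is not the TIM implementation: the number of RR-sets is taken as given rather than estimated, and the function names and graph layout are assumptions for illustration.

```python
import random
from collections import deque


def random_rr_set(in_neighbors, nodes, rng):
    """Sample one random RR-set under IC using backward BFS.

    in_neighbors: dict mapping v -> list of (u, p_uv) for edges (u, v).
    Edges are tested lazily, each at most once per RR-set.
    """
    root = rng.choice(nodes)
    rr, queue = {root}, deque([root])
    while queue:
        v = queue.popleft()
        for u, p in in_neighbors.get(v, []):
            if u not in rr and rng.random() < p:
                rr.add(u)
                queue.append(u)
    return rr


def greedy_max_cover(rr_sets, k):
    """Greedily pick up to k nodes that hit as many RR-sets as possible."""
    seeds, uncovered = [], list(rr_sets)
    for _ in range(k):
        counts = {}
        for rr in uncovered:
            for u in rr:
                counts[u] = counts.get(u, 0) + 1
        if not counts:
            break  # every RR-set is already covered
        best = max(counts, key=counts.get)
        seeds.append(best)
        uncovered = [rr for rr in uncovered if best not in rr]
    return seeds


rng = random.Random(0)
graph = {1: [(0, 0.5)], 2: [(1, 0.5)]}  # toy chain 0 -> 1 -> 2
samples = [random_rr_set(graph, [0, 1, 2], rng) for _ in range(1000)]
print(greedy_max_cover(samples, 1))
```

The fraction of RR-sets covered by a seed set, multiplied by the number of nodes, serves as the estimate of that seed set's expected spread.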

Page 20: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Recap: TIM Algorithm
• (1-1/e-ε)-approximation with high probability
  – Same as greedy, modulo probabilistic part
• Orders of magnitude faster than Greedy + Monte Carlo simulations
• Scalable to billion-edge graphs
• Applies to a large family of stochastic propagation models

Page 21: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Generalized RR-set and TIM Algorithms

• Works for any stochastic propagation model satisfying monotonicity and submodularity
  – Has (1-1/e-ε)-approximation with high probability
• General RR-set (in a deterministic possible world): u belongs to the RR-set rooted at v if the singleton seed set {u} can activate v
  – Note difference from “reaching”
  – Random RR-set: root v is sampled uniformly at random from the graph

Page 22: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

RR-set generation for SIM (RR-SIM)

• Problem definition and submodular setting
  – Fix B-seed set, find A-seed set (size k)
  – A is complemented by B: q(A|0) <= q(A|B)
  – B is indifferent to A: q(B|0) = q(B|A)

• Phase 1: Forward Labeling: Start from B-seed set, label node status w.r.t. B

• Phase 2: Backward BFS (details next)

Page 23: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Phase 2: Backward BFS
• Randomly choose root v from the graph
• Enqueue v into a FIFO queue Q
• Until Q is empty, repeatedly dequeue a node u from Q
• Enqueue u’s in-neighbours (with edge test) if either is true
  – u is B-adopted and α(u,A) <= q(A|B)
  – u is not B-adopted and α(u,A) <= q(A|0)
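The steps above can be sketched inside one fixed possible world as follows. The sketch assumes that Phase 1 has already produced the set of B-adopted nodes, that edge tests are given as a set of live edges, and that every dequeued node joins the RR-set because a seed adopts A unconditionally. These assumptions and all names are illustrative, not the authors' code.

```python
from collections import deque


def rr_sim_backward_bfs(root, in_neighbors, live_edges, b_adopted,
                        alpha_a, q_a_none, q_a_b):
    """Backward BFS for one RR-set in a fixed possible world.

    in_neighbors: dict mapping v -> list of in-neighbors u
    live_edges:   set of open edges (u, v)
    b_adopted:    set of nodes labelled B-adopted by Phase 1
    alpha_a:      dict mapping node -> threshold alpha(node, A)
    """
    rr, queue = {root}, deque([root])
    while queue:
        u = queue.popleft()
        # Pick the GAP that applies to u given its B status.
        q = q_a_b if u in b_adopted else q_a_none
        if alpha_a[u] > q:
            continue  # u would not adopt A when informed: do not expand through u
        for w in in_neighbors.get(u, []):
            if (w, u) in live_edges and w not in rr:
                rr.add(w)
                queue.append(w)
    return rr


# Toy chain 0 -> 1 -> 2 with all edges open; node 1 passes only if it holds B.
in_nb = {1: [0], 2: [1]}
live = {(0, 1), (1, 2)}
alpha = {0: 0.9, 1: 0.5, 2: 0.1}
print(sorted(rr_sim_backward_bfs(2, in_nb, live, {1}, alpha, 0.2, 0.6)))
print(sorted(rr_sim_backward_bfs(2, in_nb, live, set(), alpha, 0.2, 0.6)))
```

In the toy chain, labelling node 1 as B-adopted lets the search pass through it, so the RR-set grows from two nodes to three. This is the complementary effect of B on A's spread.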

Page 24: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

RR-Set generation for CIM (RR-CIM)

• Given A-seed set, find best complementing B set
• Cross-submodularity holds when q(B|A) = 1
• Forward Labeling: Start from A-seed set, identify nodes that can potentially be A-adopted

• Backward BFS: Two passes required

Page 25: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Sandwich Approximation
• Given any non-submodular set function, how to leverage submodular maximization (e.g., greedy, local search) to achieve provable approximation guarantees?
• Answer:
  – Derive upper/lower bound submodular functions (“sandwiched”)
  – Use the best of the three solutions (from the upper bound, the lower bound, and the original function), which gives a data-dependent approximation ratio
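For the monotone case, the recipe above might be sketched as follows, with plain greedy standing in for whichever submodular maximizer is used. The names `greedy` and `sandwich` and the toy set functions are hypothetical, chosen only so the example runs.

```python
def greedy(f, ground, k):
    """Standard greedy: add the element with the best value of f, k times."""
    chosen = frozenset()
    for _ in range(k):
        candidates = [x for x in ground if x not in chosen]
        if not candidates:
            break
        best = max(candidates, key=lambda x: f(chosen | {x}))
        chosen = chosen | {best}
    return chosen


def sandwich(sigma, lower, upper, ground, k):
    """Run greedy on the lower bound, the target and the upper bound,
    then keep whichever solution scores best under the target sigma."""
    solutions = [greedy(f, ground, k) for f in (lower, sigma, upper)]
    return max(solutions, key=sigma)


# Toy functions (assumed): lower(S) <= sigma(S) <= upper(S) for every S.
def lower(S):
    return 0.5 * len(S)


def upper(S):
    return float(len(S))


def sigma(S):
    return float(len(S)) if 2 in S else 0.5 * len(S)


print(sorted(sandwich(sigma, lower, upper, [0, 1, 2], 1)))
```

All three candidate solutions are compared under the original function, so the returned set is never worse than running greedy on the original function alone.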

Page 26: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Sandwich Approximation
[Figure: the non-submodular function we want to maximize, sandwiched between a submodular lower bound and a submodular upper bound]
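The data-dependent ratio can be worked out from the sandwich picture. The symbols are assumptions introduced here: \sigma is the target function, \mu <= \sigma <= \nu are the lower and upper bounds, S_\mu and S_\nu are the greedy solutions on the bounds, S^* is an optimal solution for \sigma, and greedy is assumed to give a (1 - 1/e) guarantee on each monotone submodular bound.

```latex
\begin{align*}
\sigma(S_\nu)
  &= \frac{\sigma(S_\nu)}{\nu(S_\nu)}\,\nu(S_\nu)
   \;\ge\; \frac{\sigma(S_\nu)}{\nu(S_\nu)}\Bigl(1-\tfrac{1}{e}\Bigr)\nu(S^*)
   \;\ge\; \frac{\sigma(S_\nu)}{\nu(S_\nu)}\Bigl(1-\tfrac{1}{e}\Bigr)\sigma(S^*), \\
\sigma(S_\mu)
  &\ge \mu(S_\mu)
   \;\ge\; \Bigl(1-\tfrac{1}{e}\Bigr)\mu(S^*)
   \;=\; \frac{\mu(S^*)}{\sigma(S^*)}\Bigl(1-\tfrac{1}{e}\Bigr)\sigma(S^*), \\
\sigma(S_{\mathrm{sand}})
  &\ge \max\!\left\{\frac{\sigma(S_\nu)}{\nu(S_\nu)},\;
                    \frac{\mu(S^*)}{\sigma(S^*)}\right\}
       \Bigl(1-\tfrac{1}{e}\Bigr)\sigma(S^*).
\end{align*}
```

The first factor, \sigma(S_\nu)/\nu(S_\nu), can be computed after the run, which is why the ratio is data-dependent. Both factors approach 1 as the bounds get tighter.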

Page 27: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Remarks
• Applicable to any non-submodular function maximization
• If monotone, run Greedy on the upper bound, lower bound, and the actual function
• If non-monotone, run Local Search
• Upper/lower bound should be reasonably tight to be meaningful

Page 28: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Experiments: Datasets

Synthetic datasets with up to 1 million nodes are also used

Page 29: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Learning Global Adoption Probabilities

Dataset: Flixster
• Signals for adoption: rated a movie
• Signals for informed: “Want to See”, “Not Interested”

Page 30: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Effects of ε in General TIM algorithm: Tradeoff between seed set quality and running time

Page 31: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

SIM experiments: spread

Page 32: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

CIM experiments: spread

Page 33: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Running time

Page 34: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Sandwich Approximation Bounds

Page 35: From Competition to Complementarity: Comparative Influence Diffusion and Maximization

Thank you!

See you in VLDB’16!