

    Learning from Rational Behavior: Predicting Solutions to Unknown Linear Programs

    Shahin Jabbari, Ryan Rogers, Aaron Roth, Steven Wu

    Motivating Example: Learning from Revealed Preferences

    Unknown Constraints Problem:
    • A store owner sees the same customer every day. The customer's utility over the d goods, u : R^d → R with u(x) = v · x, is known, but the customer has a set of unknown constraints.
    • At each day t, known prices p_t ∈ R^d_+ are posted over the d goods, and we know the customer's budget b_t.
    • Predict the feasible bundle of goods that maximizes the customer's utility.

    Unknown Objective Problem:
    • Each day t a subset of buyers S_t ⊆ [n] enters the store, where each person i has an unknown utility over the d goods, u_i : R^d → R with u_i(x) = v_i · x.
    • Predict the feasible bundle of goods in the known polytope P_t that maximizes the welfare of S_t.

    General Problem: Learning from Rational Behavior

    • At each round t, an agent optimizes a linear objective function subject to linear constraints (i.e. solves a linear program).
    • Predict the agent's response: the solution of an unknown linear program, given partial feedback either about the constraints or about the objective.

    Mathematical Model

    Unknown Constraints Problem: v and (p_t, b_t) known, (A, b) unknown

        x(t) = argmax_x  v · x
               s.t.      A x ≤ b
                         p_t · x ≤ b_t
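As a concrete illustration of the LP the customer solves each day (and which the learner cannot solve itself, since (A, b) are hidden from it), here is a minimal sketch using scipy.optimize.linprog; the function name daily_bundle and the nonnegativity bounds on the bundle are illustrative assumptions, not part of the poster.

```python
import numpy as np
from scipy.optimize import linprog

def daily_bundle(v, A, b, p_t, b_t):
    """Bundle the customer buys on day t: maximize v . x subject to the
    unknown constraints A x <= b and the known budget p_t . x <= b_t.
    (A, b) are passed here only to illustrate the agent's LP; the
    learner never observes them."""
    A_ub = np.vstack([A, p_t])                 # append the budget row to A
    b_ub = np.append(b, b_t)
    # linprog minimizes, so negate v to maximize v . x; assume x >= 0.
    res = linprog(c=-np.asarray(v, dtype=float), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * len(v))
    return res.x
```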

    Unknown Objective Problem: S_t and P_t known, (v_1, ..., v_n) unknown

        x(t) = argmax_x  ∑_{i ∈ S_t} v_i · x
               s.t.      x ∈ P_t
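In the same spirit, here is a minimal sketch of the welfare-maximization LP whose solution the learner must predict, assuming (purely for illustration) that P_t is given as {x : C x ≤ c, x ≥ 0} and that the matrix V of valuation vectors were known; the names welfare_bundle, C, and c are assumptions, not the poster's notation.

```python
import numpy as np
from scipy.optimize import linprog

def welfare_bundle(V, S_t, C, c):
    """Welfare-maximizing bundle over the known polytope
    P_t = {x : C x <= c, x >= 0} for the buyers in S_t.
    V is an n x d array whose rows are the valuation vectors v_i
    (unknown to the learner; known here only for illustration)."""
    v_sum = V[sorted(S_t)].sum(axis=0)         # aggregate objective: sum over i in S_t of v_i
    res = linprog(c=-v_sum, A_ub=C, b_ub=c,
                  bounds=[(0, None)] * V.shape[1])
    return res.x
```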

    Mistake Model: for day t = 1, 2, ...
    • Predict x̂(t), then see the optimal x(t).
    • If x̂(t) ≠ x(t), a mistake is made at round t.
    • The goal is to minimize the total number of mistakes.
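A minimal sketch of this online protocol as code; the learner/environment interface (predict, update, optimum) is an illustrative assumption, not part of the poster.

```python
import numpy as np

def count_mistakes(learner, environment, rounds, tol=1e-9):
    """Run the mistake model: each day the learner predicts a bundle,
    then observes the true optimum x(t), and we count disagreements."""
    mistakes = 0
    for t in range(rounds):
        x_hat = learner.predict(t)          # the guess x̂(t)
        x_true = environment.optimum(t)     # the revealed optimum x(t)
        if not np.allclose(x_hat, x_true, atol=tol):
            mistakes += 1                   # mistake at round t
        learner.update(t, x_true)           # the learner always sees x(t)
    return mistakes
```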

    Assumptions

    • Coefficients of the constraint matrix A can be represented with N bits of precision.
    • The constraint matrix A satisfies standard non-degeneracy assumptions.

    Unknown Constraints Problem: Main Algorithm - LearnEdge

    We give an algorithm LearnEdge with mistake bound and running time polynomial in the number of edges of the polytope Ax ≤ b, the dimension d, and the precision N.

    • Intuition: learn the feasible region on the edges of the polytope Ax ≤ b.

    [Figure: a single edge e of the polytope, partitioned into F_e (feasible region), Q_e (questionable region), and Y_e (identified infeasible region).]

    • Apply a halving algorithm on each edge of the unknown polytope: with each mistake, make progress towards learning the unknown feasible region (see the one-edge sketch after this list).
    • The mistake bound is polynomial in d when the number of rows of A is d + O(1).
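A simplified one-edge sketch of that halving bookkeeping, assuming the edge is parametrized by s ∈ [0, 1] with the feasible portion forming a prefix; the class name EdgeState and the specific midpoint-query rule are illustrative simplifications, not the LearnEdge algorithm itself.

```python
class EdgeState:
    """Bookkeeping for a single edge parametrized by s in [0, 1]:
    F_e = [0, feasible_up_to], Y_e = [infeasible_from, 1], and the
    questionable region Q_e lies in between."""

    def __init__(self):
        self.feasible_up_to = 0.0      # right end of known-feasible F_e
        self.infeasible_from = 1.0     # left end of known-infeasible Y_e

    def guess(self):
        # Query the midpoint of the questionable region Q_e.
        return 0.5 * (self.feasible_up_to + self.infeasible_from)

    def update(self, s, is_feasible):
        # Feedback about a point s shrinks Q_e; feedback that contradicts
        # the midpoint guess halves |Q_e|, so O(N) such updates suffice to
        # pin the feasibility boundary down to N bits of precision.
        if is_feasible:
            self.feasible_up_to = max(self.feasible_up_to, s)
        else:
            self.infeasible_from = min(self.infeasible_from, s)
```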

    Unknown Constraints Problem: Lower Bound and Extension

    Lower Bound: Dependence on Bits of Precision N

    • There is a lower bound of N on the number of mistakes of any algorithm when d ≥ 3.
    • For d = 1 and 2 there exist efficient algorithms with finite mistake bounds independent of N.

    Extension: Stochastic Setting

    • What if the constraints are chosen from an unknown distribution, rather than adversarially?
      - We give an efficient algorithm LearnHull with a mistake bound polynomial in the dimension d and the number of edges of the polytope Ax ≤ b, but with no dependence on the precision N.
      - LearnHull simply predicts the point with the highest objective value from the convex hull of previously observed solutions (a sketch follows), so it never makes an infeasible guess, and the learned feasible region strictly expands upon every mistake.
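A minimal sketch of that prediction rule, under the assumption that the prediction is taken from the hull of past observations intersected with the known day-t budget constraint p_t · x ≤ b_t; the name LearnHull is the poster's, but this particular implementation (an LP over convex-combination weights via scipy) and the day-1 fallback are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

class LearnHull:
    """Keep the convex hull of previously observed optimal bundles and
    predict the point in it with the highest objective value, subject to
    the known day-t budget constraint p_t . x <= b_t.  Illustrative
    sketch only; before any observation, some default guess is needed."""

    def __init__(self, v):
        self.v = np.asarray(v, dtype=float)    # known objective vector
        self.pts = []                          # observed optimal solutions

    def predict(self, p_t, b_t):
        P = np.array(self.pts)                 # rows span the current hull
        k = len(P)
        # Optimize over convex combinations x = P^T lam with lam >= 0, sum lam = 1.
        obj = -(P @ self.v)                    # maximize v . x over the hull
        budget_row = (P @ p_t).reshape(1, k)   # p_t . x <= b_t in lam-coordinates
        res = linprog(obj, A_ub=budget_row, b_ub=[b_t],
                      A_eq=np.ones((1, k)), b_eq=[1.0],
                      bounds=[(0, None)] * k)
        return P.T @ res.x                     # predicted bundle x̂(t)

    def update(self, x_true):
        self.pts.append(np.asarray(x_true, dtype=float))
```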

    Unknown Objective Problem: Main Algorithm

    We give an algorithm LearnEllipsoid with mistake bound and running time polynomial in the dimension d and the precision N.

    • We leverage the ELLIPSOID algorithm.
    • Start with an ellipsoid containing the set of all possible vectors v that parametrize the unknown objective function.
    • Each mistake provides a separating hyperplane for the ELLIPSOID algorithm to update on (a one-step update sketch follows this list).
    • The number of mistakes LearnEllipsoid makes is bounded by the number of updates of the ELLIPSOID algorithm.
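For concreteness, here is the standard central-cut ellipsoid update that such a separating hyperplane would trigger; the formulas are the classical ones (valid for d ≥ 2), while how a mistake is turned into the cut vector a is specific to the paper and is only indicated in the comment.

```python
import numpy as np

def ellipsoid_cut(center, P, a):
    """One central-cut update of the ellipsoid
    E = {v : (v - center)^T P^{-1} (v - center) <= 1}
    against the half-space {v : a . (v - center) <= 0}.  In LearnEllipsoid,
    a would encode the separating hyperplane revealed by a mistake: the
    observed optimum rules out the objective vectors on the wrong side."""
    d = len(center)
    g = a / np.sqrt(a @ P @ a)                 # normalized cut direction
    Pg = P @ g
    new_center = center - Pg / (d + 1)
    new_P = (d**2 / (d**2 - 1.0)) * (P - (2.0 / (d + 1)) * np.outer(Pg, Pg))
    return new_center, new_P                   # the ellipsoid's volume shrinks with every update
```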

    NIPS 2016 University of Pennsylvania