TRANSCRIPT
Predicting a Correct Program in PBE
Rishabh Singh, Microsoft Research
Sumit Gulwani, Microsoft Research
Programming By Examples
Intuitive
Natural
Accessible
Ambiguity!
Excel Forums
300_w1_aniSh_c1_b w1
=MID("300_w1_aniSh_c1_b",5,2)
300_w30_aniSh_c1_b w30
=MID($B:$B,FIND("_",$B:$B)+1,FIND("_",REPLACE($B:$B,1,FIND("_",$B:$B),""))-1)
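The two forum formulas can be mirrored in Python to see why a single example is ambiguous. This is only an illustrative sketch. The function names `fixed_positions` and `between_underscores` are hypothetical and do not come from the talk.

```python
def fixed_positions(s: str) -> str:
    """Mirror of =MID(s, 5, 2): two characters starting at 1-based position 5."""
    return s[4:6]


def between_underscores(s: str) -> str:
    """Mirror of the FIND/REPLACE formula: text between the 1st and 2nd underscore."""
    first = s.find("_")       # 0-based index of the first "_"
    rest = s[first + 1:]      # drop everything up to and including it
    second = rest.find("_")   # next "_" in the remainder
    return rest[:second]


inputs = ["300_w1_aniSh_c1_b", "300_w30_aniSh_c1_b"]
results = [(fixed_positions(s), between_underscores(s)) for s in inputs]
for s, (fixed, general) in zip(inputs, results):
    print(s, fixed, general)
```

Both functions agree on the first input, so one example cannot tell them apart. Only the second input shows that the fixed-position version truncates `w30`.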
Excel Forums
FlashFill [Gulwani POPL 2011] [Gulwani, Harris, Singh CACM 2012]
DSL
VSA
Program
Heuristics
Benchmarks
DSL
VSA
Program
Ranking
Benchmarks
Handling Ambiguity
Input            Output
Rick Rashid      Mr. Rick
Satya Nadella
Prefer non-constants
Input            Output
Rick Rashid      Mr. Rick
Satya Nadella    Ms. Satya
Prefer smaller substrings as constants
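A small sketch of the ambiguity above: several programs agree on "Rick Rashid" but diverge on "Satya Nadella". The three candidate programs and their names are hypothetical stand-ins for entries in the version space. They are not FlashFill's actual DSL.

```python
from typing import Callable, Dict


def first_upper(s: str) -> str:
    """Return the first uppercase letter of s."""
    return next(c for c in s if c.isupper())


# Hypothetical candidates, all consistent with "Rick Rashid" -> "Mr. Rick".
candidates: Dict[str, Callable[[str], str]] = {
    "constant 'Mr. ' + first word":
        lambda s: "Mr. " + s.split()[0],
    "'M' + lowercased 1st uppercase letter + '. ' + first word":
        lambda s: "M" + first_upper(s).lower() + ". " + s.split()[0],
    "constant 'Mr. Rick'":
        lambda s: "Mr. Rick",
}

consistent = [name for name, p in candidates.items()
              if p("Rick Rashid") == "Mr. Rick"]
predictions = {name: candidates[name]("Satya Nadella") for name in consistent}
for name, out in predictions.items():
    print(f"{out!r:12} <- {name}")
```

The non-constant candidate that reuses the lowercased "R" is the one that yields "Ms. Satya" on the second row.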
Prefer smaller constants
Input            Output
Satya Nadella    S. Nadella
Bill Gates
2nd word, last word, 2nd capital followed by 2nd lowercase string….
Machine Learning for Ranking
“With great power comes great responsibility.”
Labelled Training Data
Machine Learning Algorithm
Efficient Ranking Algorithm
Three Challenges
Training Data Generation
Input            Output
Rick Rashid      Mr. Rashid
Satya Nadella    Mr. Nadella
Peter Lee        Mr. Lee
Structuring Hypothesis Space with Sharing in Version-space
Associative Expressions: f(e1, f(e2, f(e3, e4))), DAG-based sharing
Fixed-arity Expressions: f(e1, e2, e3, e4), Set-based sharing
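To illustrate why DAG-based sharing is compact, the sketch below counts how many concatenation programs a small DAG encodes without enumerating them. It assumes every edge (i, j) holds the atomic expressions that can produce the output substring between positions i and j. The edge labels are made-up strings, and `count_programs` is a hypothetical helper.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def count_programs(num_nodes: int, edges: Dict[Edge, List[str]]) -> int:
    """Count the concatenation programs encoded by a DAG over nodes 0..num_nodes-1.

    Every edge (i, j) with i < j carries a set of atomic expressions; each path
    from node 0 to the last node, with one expression chosen per edge, is one
    program.
    """
    ways = [0] * num_nodes
    ways[0] = 1  # one way to produce the empty prefix
    for j in range(1, num_nodes):
        ways[j] = sum(ways[i] * len(exprs)
                      for (i, k), exprs in edges.items() if k == j)
    return ways[-1]


# Hypothetical DAG for a two-character output "ab".
edges = {
    (0, 1): ["ConstStr('a')", "SubStr(0, 1)"],
    (1, 2): ["ConstStr('b')", "SubStr(1, 2)", "SubStr(-1, end)"],
    (0, 2): ["ConstStr('ab')"],
}
total = count_programs(3, edges)
print(total)  # 2 * 3 + 1 = 7
```

Six stored expressions represent seven programs here. The gap widens quickly as the output string grows.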
Ranking Function f(p)
Assume Linear Function
f(p) = w1*f1 + w2*f2 + … + wk*fk
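A minimal sketch of the linear ranking function, assuming each program has already been mapped to a feature vector. The feature meanings, program names, and weight values are invented for illustration. They are not learned values from the talk.

```python
from typing import Dict, List


def score(weights: List[float], features: List[float]) -> float:
    """f(p) = w1*f1 + w2*f2 + ... + wk*fk for one program's feature vector."""
    return sum(w * f for w, f in zip(weights, features))


# Hypothetical features: [number of constants, number of substring expressions]
programs: Dict[str, List[float]] = {
    "all-constant": [1.0, 0.0],
    "constant + substring": [1.0, 1.0],
    "two substrings": [0.0, 2.0],
}
weights = [-1.0, 0.5]  # made-up weights

best = max(programs, key=lambda name: score(weights, programs[name]))
print(best)
```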
Learning To Rank
Logistic Regression: Didn’t work well
Listwise Approach: Too strong a constraint
All relevant pages over irrelevant
Training Phase
Input            Output
Rick Rashid      Mr. Rick
Satya Nadella    Mr. Satya
Peter Lee        Mr. Lee
Lower 1st uppercase letter
Constant “r”
Lower 2nd uppercase letter
…
Goal: Find ranking function f(p) over program features that ranks positive programs higher than negative programs
Learn DAGs
[Figure: DAG over nodes 0 to 8 with edges labeled 𝛾1 to 𝛾13 and 𝛾50]
Rick Rashid      Mr. Rashid
Satya Nadella    Mr. Satya
[Figure: DAG over nodes 0 to 8 with edges labeled 𝛾1 to 𝛾8, 𝛾11 to 𝛾13, and 𝛾50]
Intersect DAGs
[Figure: DAG ∩ DAG ≡ intersected DAG]
Rick Rashid      Mr. Rick
Satya Nadella    Mr. Satya
Assign Positive Labels
Assign Negative Labels
Learn ranking function f(p) that ranks positive programs higher than negative programs.
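The labeling step can be sketched with an explicit list of candidate programs in place of DAGs. The sketch assumes that positive programs are those consistent with all examples (the intersection) and that negative programs are those consistent with the first example only. That is one reading of the slide sequence, not a confirmed detail. All names here are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Program = Callable[[str], str]
Example = Tuple[str, str]


def first_upper(s: str) -> str:
    """Return the first uppercase letter of s."""
    return next(c for c in s if c.isupper())


def consistent(p: Program, examples: List[Example]) -> bool:
    """True if p maps every example input to its expected output."""
    return all(p(inp) == out for inp, out in examples)


def label_programs(candidates: Dict[str, Program], examples: List[Example]):
    """Split candidates into positives (fit all examples) and negatives
    (fit the first example but not all of them)."""
    positives, negatives = [], []
    for name, p in candidates.items():
        if consistent(p, examples):
            positives.append(name)
        elif consistent(p, examples[:1]):
            negatives.append(name)
    return positives, negatives


candidates: Dict[str, Program] = {
    "constant-prefix": lambda s: "Mr. " + s.split()[0],
    "lower-first-upper": lambda s: "M" + first_upper(s).lower() + ". " + s.split()[0],
    "all-constant": lambda s: "Mr. Rick",
}
examples = [("Rick Rashid", "Mr. Rick"), ("Satya Nadella", "Mr. Satya")]
positives, negatives = label_programs(candidates, examples)
print(positives, negatives)
```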
Training Phase
Positive Programs
Negative Programs
Rank any positive program over all negative programs
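One way to read "rank any positive program over all negative programs" is as a per-task loss that counts the negative programs scoring at least as high as the best positive program. The sketch below is an assumption about the shape of such a loss, not the authors' exact formulation. The feature vectors and weights are invented.

```python
from typing import List

Vector = List[float]


def score(weights: Vector, features: Vector) -> float:
    """Linear ranking function f(p) = sum of w_i * f_i."""
    return sum(w * f for w, f in zip(weights, features))


def rank_loss(weights: Vector, positives: List[Vector],
              negatives: List[Vector]) -> int:
    """Number of negative programs not ranked below the best positive program.

    The loss is 0 exactly when some positive program outscores every negative.
    """
    best_positive = max(score(weights, f) for f in positives)
    return sum(1 for f in negatives if score(weights, f) >= best_positive)


# Hypothetical feature vectors for one training task.
positives = [[1.0, 1.0]]
negatives = [[2.0, 0.0], [1.0, 0.0]]

loss_a = rank_loss([0.5, 0.0], positives, negatives)
loss_b = rank_loss([-1.0, 0.5], positives, negatives)
print(loss_a, loss_b)
```

A count like this is not differentiable, so a learner would typically optimize a smooth surrogate of it.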
Hierarchical Ranking
Atomic Expression: Frequency of tokens, context, neighborhood, …
Substring Expression: Length of substring, input, output, constant, …
Concat Expression: Number of Arguments, sum, max, min, prod
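The concat-level features listed above can be sketched as aggregates over lower-level scores. The sketch assumes each argument of a concatenation already has a numeric score from a lower-level ranker. The helper name `concat_features` and the sample scores are hypothetical.

```python
import math
from typing import List


def concat_features(argument_scores: List[float]) -> List[float]:
    """Features of a concatenation built from the scores of its arguments:
    number of arguments, sum, max, min, and product."""
    return [
        float(len(argument_scores)),
        sum(argument_scores),
        max(argument_scores),
        min(argument_scores),
        math.prod(argument_scores),
    ]


features = concat_features([0.5, 2.0, 1.0])
print(features)
```

Aggregating this way gives a fixed-length feature vector however many arguments the concatenation has.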
Evaluation
175 benchmarks
30-70 train-test partition
Baseline (Occam’s razor): Smallest & Simplest programs
Ranking Evaluation
LearnRank learns from 1 example for 79% of benchmarks
[Figure: Number of Examples for Learning the Test Task. x-axis: Benchmarks (1 to 121); y-axis: Number of Examples (0 to 6); series: Baseline, LearnRank]
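The plotted metric, the number of examples needed to learn a task, can be sketched as a loop that feeds a synthesizer one more example at a time. Both `make_synthesizer`, which returns the highest-ranked consistent candidate, and the two candidate programs are hypothetical stand-ins. This is not the evaluation harness used in the talk.

```python
from typing import Callable, List, Optional, Tuple

Program = Callable[[str], str]
Example = Tuple[str, str]


def make_synthesizer(ranked: List[Program]) -> Callable[[List[Example]], Program]:
    """Hypothetical synthesizer: return the highest-ranked consistent candidate."""
    def synthesize(given: List[Example]) -> Program:
        return next(p for p in ranked
                    if all(p(i) == o for i, o in given))
    return synthesize


def examples_needed(synthesize: Callable[[List[Example]], Program],
                    task: List[Example]) -> Optional[int]:
    """Smallest k such that the program learned from the first k examples
    is correct on every example of the task (None if no k works)."""
    for k in range(1, len(task) + 1):
        program = synthesize(task[:k])
        if all(program(i) == o for i, o in task):
            return k
    return None


def all_constant(s: str) -> str:
    return "Mr. Rick"


def constant_prefix(s: str) -> str:
    return "Mr. " + s.split()[0]


task = [("Rick Rashid", "Mr. Rick"), ("Satya Nadella", "Mr. Satya")]
poor = examples_needed(make_synthesizer([all_constant, constant_prefix]), task)
good = examples_needed(make_synthesizer([constant_prefix, all_constant]), task)
print(poor, good)
```

A ranking that puts the intended program first needs one example here, while the poorer ordering needs two.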
Efficiency of Ranking
Ranking for PBE
Machine Learning + Synthesis
VSA Sharing Formalization
Efficient Features & Algorithms
General Loss Function for PBE
Thanks!