generative grading cs398 - web.stanford.edu - generativegrading.pdfgenerative grading cs398. four...

35
Generative Grading CS398

Upload: others

Post on 12-Jul-2020

19 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Generative GradingCS398

Page 2: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Four Prototypical Trajectories

Checkout this assignment on code.org

Page 3: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

The Code.org Dataset

● Students learning nested loops

● 50k students with 1.5 million submissions to a curriculum of 8 exercises.

● 800 human labels across 2 of the exercises.

P4

Page 4: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

https://studio.code.org/s/course4/stage/10/puzzle/4

Page 5: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Four Prototypical Trajectories

Can you write an algorithm to give auto-feedback for this assignment?

Page 6: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

1E-6

1E-5

1E-4

1E-3

1E-2

1E-1

1E+0

1E+0 1E+1 1E+2 1E+3 1E+4 1E+5

Prob

abili

ty M

ass

(log

scal

e)

Rank Order (log scale)

Edit distance is meaningless, Code is Zifian

100

10-1

10-2

10-3

10-4

10-5

10-6

100 101 102 103 104 105

f(k) =1/ks

PNn (1/ns)

Code Zipf Plot

Exponential combination of decisions. Super fat tailed. Everything looks unique

Page 7: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

1E-6

1E-5

1E-4

1E-3

1E-2

1E-1

1E+0

1E+0 1E+1 1E+2 1E+3 1E+4 1E+5

Prob

abili

ty M

ass

(log

scal

e)

Rank Order (log scale)

Hard Problem

1 million unique solutions to programming Linear Regression

Brute force solution?

WWW 2014

100

10-1

10-2

10-3

10-4

10-5

10-6

100 101 102 103 104 105

f(k) =1/ks

PNn (1/ns)

Code Zipf Plot

Page 8: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Hard Problem

1 million unique solutions to programming Linear Regression

Brute force solution?

WWW 2014

1 100 100001 100 10000

1 100 10000

Code.org A Stanford A

Coursera B

1 100 10000

Stanford C

Code Zipf Plot

Page 9: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Evaluation Task

Page 10: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Stanford TAs label 800 submissions

import code.org.*;

public class MySoln {public void run() {move(50);for(int i=0; i<4; i++){if(frontIsClear()) {turnLeft(90);

}for(int j=0; j<i; i++){move(i * 20);turnRight(120);move(10);

}}

}}

Evaluation Task

Page 11: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

0.00.10.20.30.40.50.60.70.80.91.0

Feed

back

F1

Scor

e

First Problem (P1)Last Problem (P8)

Traditional Deep Learning Doesn’t Work

Old Gaurd

Humans

Label student code

Page 12: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

0.00.10.20.30.40.50.60.70.80.91.0

Feed

back

F1

Scor

e

First Problem (P1)Last Problem (P8)

Old Gaurd

HumansDeep Learning

Inaccurate, Uninterpretable, and Data Hungry

Label student coderun

cond body

putBeeper

putBeeper move

while

Raw

Pre

cond

ition

Raw

Pos

tcon

ditio

n

Multiply Program Matrix… …Encoder Decoder

Prediction LossAutoencoding Loss

Decoder

Piech et Al, ICML 2014

Page 13: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

0.00.10.20.30.40.50.60.70.80.91.0

Feed

back

F1

Scor

e

First Problem (P1)Last Problem (P8)

Old Gaurd

HumansDeep Learning

Inaccurate, Uninterpretable, and Data Hungry

Label student code

Page 14: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

0.00.10.20.30.40.50.60.70.80.91.0

Feed

back

F1

Scor

e

First Problem (P1)Last Problem (P8)

Old Gaurd

HumansDeep Learning

Data Hungry

Label student code

We need one shot learning

We needverifiability

Page 15: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Four Prototypical Trajectories

We need a strategy! How are we going to solve this problem???

Page 16: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Four Prototypical Trajectories

[Suspense]

Page 17: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Four Prototypical Trajectories

Taste of the future of AI

Page 18: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Humans Don’t Need Much Data

Single training example:

Test set:

Page 19: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Bayesian Program Learning

Page 20: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Lake et al. Human-level concept learning through probabilistic program induction

Bayesian Program Learning

Page 21: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Imagine Students

• Struggle with double for loops

• Confuses logic for deleting bricks

Page 22: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

• Strugglewithdoubleforloops

• Confuseslogicfordeletingbricks

Imagine Students

A students“ability”

Infer ability and choices from code

A students“choices”

The resulting code

This is easy and exponential This is hard and linear

Page 23: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

0.00.10.20.30.40.50.60.70.80.91.0

Feed

back

F1

Scor

e

First Problem (P1)Last Problem (P8)

Old Gaurd

HumansDeep Learning

Label student code

Generative Understanding

• Strugglewithdoubleforloops

• Confuseslogicfordeletingbricks

Page 24: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

0.00.10.20.30.40.50.60.70.80.91.0

Feed

back

F1

Scor

e

First Problem (P1)Last Problem (P8)

Generative Understanding

Old Gaurd

Humans

Label student code

Deep Learning

Zero Shot

Learning

• Strugglewithdoubleforloops

• Confuseslogicfordeletingbricks

Page 25: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Four Prototypical Trajectories

In simple terms: generate a ton of our own labelled examples. It’s an unreasonably

good way to start to hit high performance

Page 26: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Four Prototypical Trajectories

How many decisions with 3 options to get to10K labelled programs?

Page 27: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Four Prototypical Trajectories

How do I write said grammars??

Page 28: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

ideaToText

Page 29: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Teachers Articulate N misconceptionsThis is code for a single decision point

Give a name to the choice that the student is making

How do those choices translate into grades?

What does the code look like? Often evokes other decision points

1.

2.

3.

4.

Page 30: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Starter Code in Blocks

Solution in Blocks

Starter Code in PsuedoCode

Solution in PsuedoCode

For(???, ???, ???) {}

For(15, 300, 15) {Repeat(4) {

Move(Counter)TurnLeft(90)

}}

Page 31: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Four Prototypical Trajectories

First challenge: no solution or yes solution.

Second challenge: draw square using repeat or unrolled

Page 32: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Four Prototypical Trajectories

Pro Tips

Page 33: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Keep your grammar clean!

One idea per class

Page 34: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Have a knowledge prior

Decisions are not independent

Page 35: Generative Grading CS398 - web.stanford.edu - GenerativeGrading.pdfGenerative Grading CS398. Four Prototypical Trajectories Checkout this assignment on code.org. The Code.orgDataset

Be clever in your use of outcomes

Template Choices