Kaggle Facebook Recruiting Contest

Upload: curt3n5

Post on 12-Apr-2017


Page 1: Kaggle facebook recruiting contest

Kaggle Facebook Recruiting Contest

Page 2

THE DATA

• Training set: 7 million+ Stack Overflow posts with tags

• Test set: 2 million Stack Overflow posts without tags

• Goal: supply tags for the test set Stack Overflow posts

Page 3

ATTACK 1

• Build Bayesian classifiers for each tag in training set

• 1) Tried applying entire posts to each classifier: classifiers became too large and slow

• 2) Applied POS tagger to posts and used only nouns to train classifiers: classifiers still became too large and slow
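A minimal sketch of the one-classifier-per-tag idea, assuming a multinomial naive Bayes model with Laplace smoothing and a whitespace tokenizer. The slides do not say which Bayesian variant or library was used, so the class `TagClassifier`, the helper `predict_tags`, and the toy posts are all hypothetical and only illustrate the approach.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption); a noun-only filter
    # from a POS tagger would be applied here in the second attempt.
    return text.lower().split()


class TagClassifier:
    """Binary naive Bayes for a single tag (hypothetical helper)."""

    def __init__(self, tag):
        self.tag = tag
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}
        self.vocab = set()

    def train(self, text, tags):
        label = self.tag in tags
        self.doc_counts[label] += 1
        for word in tokenize(text):
            self.word_counts[label][word] += 1
            self.vocab.add(word)

    def log_score(self, text, label):
        total_docs = sum(self.doc_counts.values())
        # Laplace-smoothed class prior.
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(self.word_counts[label].values())
        for word in tokenize(text):
            if word in self.vocab:  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
        return score

    def predict(self, text):
        return self.log_score(text, True) > self.log_score(text, False)


# Toy labeled posts (invented for illustration only).
posts = [
    ("how do I parse json in python", {"python", "json"}),
    ("python list comprehension syntax", {"python"}),
    ("java null pointer exception", {"java"}),
    ("java string concatenation performance", {"java"}),
]

# One classifier per tag, each trained on every post.
all_tags = set().union(*(tags for _, tags in posts))
classifiers = {tag: TagClassifier(tag) for tag in all_tags}
for text, tags in posts:
    for clf in classifiers.values():
        clf.train(text, tags)


def predict_tags(text):
    return {tag for tag, clf in classifiers.items() if clf.predict(text)}


print(predict_tags("python dictionary syntax"))
```

In this sketch every classifier keeps its own word counts for both classes, so memory grows roughly with the number of tags times the vocabulary size, which is consistent with the "too large and slow" observation on the slide.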

Page 4

ATTACK 2

• Text search of training set posts for a list of high-frequency training set tags

• 1) Simple application caused too many false positives

• 2) Finally rated each tag in the list by its false positive to positive ratio and removed problematic tags from the list
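The text-search approach with false-positive filtering could look roughly like the sketch below. It assumes that "positive" means a true positive (the tag occurs in the text and is also one of the post's real tags), that matching is done on whole lowercase tokens, and that a tag counts as high frequency once it reaches a minimum count. The function names, the `min_count` and `max_ratio` thresholds, and the toy posts are hypothetical; the slides give none of these details.

```python
import re
from collections import Counter


def tokens(text):
    # Lowercase word tokens; keeps '#' and '+' so tags like "c#" can match.
    # Hyphenated or dotted tags would need a different matcher.
    return set(re.findall(r"[a-z0-9#+]+", text.lower()))


def frequent_tags(labeled_posts, min_count):
    counts = Counter(tag for _, tags in labeled_posts for tag in tags)
    return {tag for tag, n in counts.items() if n >= min_count}


def rate_tags(labeled_posts, candidates):
    """Return {tag: (false_positives, true_positives)} for text matches."""
    false_pos = Counter()
    true_pos = Counter()
    for text, tags in labeled_posts:
        for tag in tokens(text) & candidates:
            if tag in tags:
                true_pos[tag] += 1
            else:
                false_pos[tag] += 1
    return {tag: (false_pos[tag], true_pos[tag]) for tag in candidates}


def keep_reliable(stats, max_ratio):
    # Drop tags that never match correctly or whose ratio is too high.
    return {
        tag
        for tag, (fp, tp) in stats.items()
        if tp > 0 and fp / tp <= max_ratio
    }


def predict_tags(text, kept_tags):
    return tokens(text) & kept_tags


# Toy labeled posts (invented for illustration only).
labeled = [
    ("how to sort a list in python", {"python", "list"}),
    ("python string formatting", {"python", "string"}),
    ("read a file in java", {"java", "file"}),
    ("java string to int", {"java", "string"}),
    ("save python list to a file", {"python", "file"}),
    ("java list of file names", {"java", "list"}),
    ("list of string methods", {"string"}),
    ("print a list in java", {"java"}),
]

candidates = frequent_tags(labeled, min_count=2)
stats = rate_tags(labeled, candidates)
kept = keep_reliable(stats, max_ratio=1.0)

print(sorted(kept))
print(predict_tags("sort a list of java string values", kept))
```

With this toy data the word "list" appears in many posts that are not tagged `list`, so its ratio exceeds the threshold and it is removed, while tags such as `java` and `string` survive.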

Page 5

284/367 - hoorah