TRANSCRIPT

Page 1: Adrian Colyer CS Research for practitioners · 2018. 2. 8. · 23 A dirty dozen: twelve common metric interpretation pitfalls in online controlled experiments, Dmitriev et al., KDD’17
Page 2:

CS Research for practitioners: lessons from The Morning Paper

Adrian Colyer

@adriancolyer

Page 3:

blog.acolyer.org

650
Foundations
Frontiers

Page 4:

Image copyright: iqoncept / 123RF Stock Photo

Page 5:

5

One hot / 1-of-N

Page 6:

6

Distributed representation
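The contrast between the two encodings on these slides can be sketched in a few lines of Python. The vectors below are hand-picked toy values purely for illustration, not learned embeddings:

```python
import numpy as np

vocab = ["king", "queen", "man", "woman", "apple"]

# One-hot / 1-of-N: each word is a sparse vector with a single 1;
# every pair of distinct words is equally dissimilar (dot product 0).
def one_hot(word):
    v = np.zeros(len(vocab))
    v[vocab.index(word)] = 1.0
    return v

# Distributed representation: each word is a dense, low-dimensional
# vector; similarity between words is graded rather than all-or-nothing.
embeddings = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.2, 0.1]),
    "man":   np.array([0.1, 0.8, 0.2]),
    "woman": np.array([0.1, 0.2, 0.2]),
    "apple": np.array([0.0, 0.1, 0.9]),
}

print(one_hot("king") @ one_hot("queen"))        # 0.0 -- one-hot encodes no similarity
print(embeddings["king"] @ embeddings["queen"])  # positive -- dense vectors encode relatedness
```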

Page 7:

Finding meaning in context

7

Page 8:

8

method for high quality learning

Page 9:

9

learning method for high quality

Page 10:

Vector offsets

10

Page 11:

King - Man + Woman = ?
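The famous analogy can be reproduced mechanically with vector offsets and cosine similarity. A minimal sketch with hand-picked toy vectors (real word2vec vectors are learned from a large corpus; the values here are assumptions chosen so the geometry works out):

```python
import numpy as np

# Toy embeddings, hand-picked for illustration only.
E = {
    "king":  np.array([0.9, 0.9, 0.1]),
    "queen": np.array([0.9, 0.1, 0.1]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.1]),
    "apple": np.array([0.0, 0.1, 0.9]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# king - man + woman = ?  (the vector-offset trick)
target = E["king"] - E["man"] + E["woman"]

# Exclude the query words themselves, as the word2vec analogy
# evaluation does, then take the nearest remaining word.
candidates = [w for w in E if w not in ("king", "man", "woman")]
best = max(candidates, key=lambda w: cosine(E[w], target))
print(best)  # queen
```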

11

Page 12:

More examples

12

Relationship          Example 1          Example 2          Example 3
France - Paris        Italy: Rome        Japan: Tokyo       Florida: Tallahassee
Einstein - scientist  Messi: midfielder  Mozart: violinist  Picasso: painter
big - bigger          small: larger      cold: colder       quick: quicker

Czech + currency = Koruna
Vietnam + capital = Hanoi
German + airlines = Lufthansa
Russian + river = Volga

Page 13:

Papers so far...

13

● Efficient estimation of word representations in vector space, Mikolov et al. 2013

● Distributed representations of words and phrases and their compositionality, Mikolov et al. 2013

● Linguistic regularities in continuous space word representations, Mikolov et al. 2013

● word2vec parameter learning explained, Rong 2014

● word2vec explained: deriving Mikolov et al.'s negative sampling word-embedding method, Goldberg & Levy 2014

● See also: GloVe: Global vectors for word representation, Pennington et al. 2014

Page 14:

14

Word Word Word Word -> Sentence -> Relation (table) -> Document

Using word embedding to enable semantic queries on relational databases, Bordawekar & Shmueli, DEEM’17

Page 15:

Find similar customers based on purchased items

15

SELECT X.custID, X.name, Y.custID, Y.name,
       similarityUDF(X.purchase, Y.purchase) AS sim
FROM sales X, sales Y
WHERE similarityUDF(X.purchase, Y.purchase) > 0.5
ORDER BY X.name, sim
LIMIT 10

Page 16:

Customers that have purchased allergenic items

16

SELECT X.number, X.name,
       similarityUDF(X.purchase, 'allergenic') AS sim
FROM sales X
WHERE similarityUDF(X.purchase, 'allergenic') > 0.3
ORDER BY X.name, sim
LIMIT 10
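The slides don't define similarityUDF. One plausible implementation, in the spirit of the Bordawekar & Shmueli paper, is cosine similarity between the mean embedding vectors of two item lists. The item vectors and helper name below are assumptions for illustration, standing in for embeddings trained over the database's token sequences:

```python
import numpy as np

# Hypothetical item vectors (in the paper these would be learned
# embeddings over tokens extracted from the relational data).
item_vec = {
    "peanuts": np.array([0.9, 0.1]),
    "almonds": np.array([0.8, 0.2]),
    "soap":    np.array([0.0, 0.9]),
}

def similarity_udf(purchase_a, purchase_b):
    """Cosine similarity between the mean vectors of two purchase
    lists -- one possible reading of the slides' similarityUDF."""
    a = np.mean([item_vec[i] for i in purchase_a], axis=0)
    b = np.mean([item_vec[i] for i in purchase_b], axis=0)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(similarity_udf(["peanuts"], ["almonds"]))  # high: related items
print(similarity_udf(["peanuts"], ["soap"]))     # low: unrelated items
```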

Page 17:

17

Accelerating innovation through analogy mining, Hope et al., KDD’17

Near purpose,Far mechanism.

Page 18:

18 Image Copyright: ververidis / 123RF Stock Photo

“there is rich meaning in context”

Page 19:

Are these ideas actually any good?

19

Page 20:

20

“despite having data, the number of companies that successfully transform into data-driven organisations stays low, and how this transformation is done in practice is little studied.”

Image Copyright: everythingpossible / 123RF Stock Photo

Page 21:

21

The evolution of continuous experimentation in software product development, Fabijan et al., ICSE’17

Image credit: Martin Fowler, “Microservices prerequisites”

Agile, Lean, CI, CD <-> CE (Continuous experimentation)

Page 22:

22

Crawl Walk Run Fly

Tech.

Org.

Biz. OEC

Engineering team self-sufficiency

Experimentation team role

Metrics
Platform
Pervasiveness

Page 23:

23

A dirty dozen: twelve common metric interpretation pitfalls in online controlled experiments, Dmitriev et al., KDD’17

Logs: debug -> signals
Signals -> metrics

Data Quality Metrics

Guardrail Metrics

Local feature & Diagnostic Metrics

OEC Metrics

Page 24:

24

Seven rules of thumb for website experimenters, Kohavi et al., KDD’14

Page 25:

25

Page 26:

26

“Any sufficiently complex system acts as a black box when it becomes easier to experiment with than to understand. Hence, black-box optimization has become increasingly important as systems become more complex.”

Page 27:

27

Google Vizier: a service for black-box optimization, Golovin et al., KDD’17

Image credit: https://pixabay.com

f: X → ℝ
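The trial loop that a black-box optimization service automates can be sketched with plain random search. Vizier itself uses smarter strategies (e.g. Bayesian optimization with Gaussian processes), and the objective below is a made-up stand-in, since the whole point is that f is opaque to the optimizer:

```python
import random

# Black-box objective f: X -> R; the optimizer may only evaluate it.
def f(x):
    return (x - 0.3) ** 2  # internals unknown to the optimizer

random.seed(0)
best_x, best_y = None, float("inf")
for trial in range(1000):
    x = random.uniform(0.0, 1.0)   # "suggest" a candidate from X = [0, 1]
    y = f(x)                       # run the trial, observe the metric
    if y < best_y:
        best_x, best_y = x, y      # keep the best configuration so far

print(best_x, best_y)  # best_x lands close to the true minimum at 0.3
```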

Page 28:

28

Page 29:

29

Page 30:

30

TFX: A TensorFlow-based production scale machine learning platform, Baylor et al., KDD’17

Page 31:

31

ActiveClean: Interactive data cleaning for statistical modeling, Krishnan et al., VLDB’16

Page 32:

32

Neural Architecture Search with reinforcement learning, Zoph et al., ICLR’17

Page 33:

33

Learning transferable architectures for scalable image recognition, Zoph et al., arXiv’17

Page 34:

34

Page 35:

35

Neurosurgeon: collaborative intelligence between the cloud and the mobile edge, Kang et al., ASPLOS’17

Page 36:

36

Page 37:

37

Distributed deep neural networks over the cloud, the edge, and end devices, Teerapittayanon et al., ICDCS’17

Page 38:

38

Page 39:

39 Image Copyright: forplayday / 123RF Stock Photo

“Planetary scale computer systems beyond our human understanding are continuously sensing, experimenting, learning, and optimising”

Page 40:

40

European Union regulations on algorithmic decision making and a “right to explanation”, Goodman & Flaxman, 2016

Page 41:

41

Practical black-box attacks against deep learning systems using adversarial examples, Papernot et al., CCS’17

Page 42:

42

Universal adversarial perturbations, Moosavi-Dezfooli et al., CVPR’17

Page 43:

43

Adversarial examples for evaluating reading comprehension systems, Jia & Liang, EMNLP’17

Page 44:

44

IoT goes nuclear: creating a ZigBee chain reaction, Ronen et al., IEEE Security & Privacy 2017

Page 45:

45

“What we demonstrate in this paper is that even IoT devices made by companies with deep knowledge of security, which are backed by industry standard cryptographic techniques, can be misused by hackers and rapidly cause city-wide disruptions which are very difficult to stop.”

Page 46:

46

CLKSCREW: Exposing the perils of security-oblivious energy management, Tang et al., USENIX Security 2017

Page 47:

47 Image Copyright: sepavo / 123RF Stock Photo

Page 48:

48

Page 49:

49

REM: Resource-efficient mining for blockchains, Zhang et al., USENIX Security 2017

Page 50:

50

I’m just building a webapp! Does any of this research stuff apply to me?

Page 51:

51

Feral concurrency control: an empirical investigation of modern application integrity, Bailis et al., SIGMOD’15

“By shunning decades of work on native database concurrency control solutions, Rails has developed a set of primitives for handling application integrity in the application tier—building, from the underlying database system’s perspective, a feral concurrency control system.”

Page 52:

52

ACIDRain: concurrency-related attacks on database backed web applications, Warszawski & Bailis, SIGMOD’17

Page 53:

53

12 eCommerce apps

60% of top-1M eCommerce sites

22 vulnerabilities

2 hours or less to craft an exploit for each

Page 54:

54

Thou shalt not depend on me: analysing the use of outdated JavaScript libraries on the web, Lauinger et al., NDSS’17

37% vulnerable (jQuery -> 36.7%, Angular -> 40.1%)

Page 55:

55

To type or not to type: quantifying detectable bugs in JavaScript, Gao et al., ICSE’17

Page 56:

Wrapping Up

56

Page 57:

57

Welcome to the crazy, wonderful, exciting, sometimes terrifying, but always fascinating world of computer science research!

Page 58:

01. A new paper every weekday: published at http://blog.acolyer.org
02. Delivered straight to your inbox: if you prefer email-based subscription to read at your leisure
03. Announced on Twitter: I’m @adriancolyer
04. Go to a Papers We Love Meetup: a repository of academic computer science papers and a community who loves reading them
05. Share what you learn: anyone can take part in the great conversation

Page 59:

THANK YOU! @adriancolyer

Cartoon images credit: Bitmoji

Page 60: